Data Cleaning and Standardization

Accessible and Preservable File Formats

Researchers can choose from many file formats, from txt to py to xlsx and more. FAIR (Findable, Accessible, Interoperable, Reusable) data, however, prioritizes accessible and preservable file formats. Consider the following questions.
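
As a rough illustration of checking a dataset's formats, the Python sketch below sorts file names into those with a preferred extension and those that may need review. The extension list and the function name flag_formats are assumptions made for this example, not an official FAIR list; check what your repository or funder recommends.

```python
from pathlib import Path

# Illustrative assumption: extensions treated here as open and easy to preserve.
# Replace this set with the formats your repository actually recommends.
PREFERRED_EXTENSIONS = {".txt", ".csv", ".json", ".xml"}


def flag_formats(filenames, preferred=PREFERRED_EXTENSIONS):
    """Split file names into (preferred, needs_review) lists by extension."""
    ok, review = [], []
    for name in filenames:
        # Compare extensions case-insensitively so ".XLSX" and ".xlsx" match.
        suffix = Path(name).suffix.lower()
        if suffix in preferred:
            ok.append(name)
        else:
            review.append(name)
    return ok, review
```

Calling flag_formats(["notes.txt", "results.XLSX"]) would place notes.txt in the first list and results.XLSX in the second, where it can be considered for conversion.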

Consistency

Human brains are really good at reconciling small differences, but computers are not. Data must be consistent in order to be interoperable. To this end, please consider the following three items.
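
To make the point concrete, the sketch below shows three records that a person would read as the same site, but that a computer treats as different until they are normalized. The sample rows, field names, and accepted date layouts are all hypothetical; this is one possible approach, not a prescribed method.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent spacing, casing, and date styles.
RAW_ROWS = [
    {"site": " Lake A", "collected": "2023-04-01", "present": "Yes"},
    {"site": "lake a", "collected": "04/02/2023", "present": "y"},
    {"site": "LAKE A ", "collected": "2023-04-03", "present": "no"},
]

# Assumed date layouts in the raw data (ISO, then US month/day/year).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")
TRUE_VALUES = {"yes", "y", "true", "1"}
FALSE_VALUES = {"no", "n", "false", "0"}


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or raise ValueError if unrecognized."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date: {text!r}")


def normalize_bool(text):
    """Map the accepted yes/no spellings to True or False."""
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"Unrecognized yes/no value: {text!r}")


def normalize_row(row):
    """Return a copy of the row with one spelling per value."""
    return {
        # Collapse extra whitespace, then use one capitalization style.
        "site": " ".join(row["site"].split()).title(),
        "collected": normalize_date(row["collected"]),
        "present": normalize_bool(row["present"]),
    }


CLEAN_ROWS = [normalize_row(row) for row in RAW_ROWS]
```

Raising an error on unrecognized values, rather than guessing, keeps inconsistencies visible so they can be fixed at the source.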

User Friendly Organization

Would you use data that is horribly organized? Of course not! Assume that your data users will not look at your documentation until they are forced to, and follow these better practices.

  • Write Directions: Provide ample explanation and direction for your dataset.
  • Folder Structures: Organize files into folders and subfolders in a way that makes data easy to find.
  • Classification: Sort files by importance, sensitivity, and other relevant criteria.
  • External Reviewers: When in doubt, ask someone else to review your dataset, preferably someone not involved in your project.
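
The folder and directions practices above can be sketched in a few lines of Python. The layout below (a README plus data/raw, data/processed, docs, and scripts) is only an assumed example, and the function name scaffold is hypothetical; adapt the names to your own project.

```python
from pathlib import Path

# Hypothetical layout: a value of None means "create a folder",
# a string means "create a starter file with this text".
LAYOUT = {
    "README.txt": "Describe the dataset, its files, and how to use them.\n",
    "data/raw": None,
    "data/processed": None,
    "docs": None,
    "scripts": None,
}


def scaffold(root, layout=LAYOUT):
    """Create the folders and starter files in layout under root."""
    root = Path(root)
    created = []
    for relative, content in layout.items():
        target = root / relative
        if content is None:
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            # Never overwrite directions someone has already written.
            if not target.exists():
                target.write_text(content, encoding="utf-8")
        created.append(target)
    return created
```

For example, scaffold("my_dataset") would create the folders under my_dataset and leave an existing README.txt untouched.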

Resources and References