Data is information that can be used as the basis for analysis. Sometimes data is easy for a human to understand in its raw form: the text in documents, for example, can be treated as data in certain analysis work, and a human would typically find it easier to process than a computer algorithm would. At other times, data analysis, exploration and presentation techniques are required before the meaning within the data becomes apparent to a human.
Data can be stored in varying degrees of ‘structure’, which determines how easily it can be understood by a computer or algorithm. Structured data is information that is highly ordered, typically tabular, can be easily ‘read’ by a computer and exists in predefined formats (often within a database). Semi-structured data is information that is not stored in a tightly defined format but has some level of organisation and standardisation (e.g. tagged images or documents).

Unstructured data usually describes information in its native form, that is, how it appears in the real world. It has not been abstracted or standardised in a predefined way. Whilst unstructured data is often the most straightforward form for a human to consume, and represents a large amount of valuable business data, it needs processing to become machine interpretable. To achieve this, a data scientist will often work to translate various forms of data into a higher degree of structure.
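As a rough illustration of moving data to a higher degree of structure, the Python sketch below turns free-text notes into records with a fixed set of fields. The sample notes, the field names (invoice_id, paid_on, amount_gbp) and the regular expressions are all hypothetical and chosen only for this example; real unstructured data is usually far messier and may need more robust techniques.

```python
import re

# Hypothetical unstructured input: free-text notes as a human might write them.
notes = [
    "Invoice 1042 was paid on 2023-03-14 for 250.00 GBP.",
    "On 2023-04-02 we received payment of 99.50 GBP against invoice 1057.",
]

# Illustrative patterns, one per field we want to extract.
INVOICE = re.compile(r"[Ii]nvoice (\d+)")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
AMOUNT = re.compile(r"(\d+\.\d{2}) GBP")


def to_record(text):
    """Extract a fixed set of fields from one free-text note.

    A field that cannot be found is set to None rather than guessed.
    """
    invoice = INVOICE.search(text)
    date = DATE.search(text)
    amount = AMOUNT.search(text)
    return {
        "invoice_id": int(invoice.group(1)) if invoice else None,
        "paid_on": date.group(0) if date else None,
        "amount_gbp": float(amount.group(1)) if amount else None,
    }


# Structured output: every record has the same keys, like rows in a table.
records = [to_record(note) for note in notes]

for record in records:
    print(record)
```

Each note is easy for a person to read, but the two sentences order their details differently. After extraction, every record has the same keys and predictable types, so it could be loaded into a table or database and queried directly.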