The simplest form of text mining is to count words and to analyze the co-occurrence among words.
The next step is to analyze the relationship among texts.
The key to this approach is to treat each text as one sample of the whole data set. Treated this way, the texts can be analyzed with the methods of Analysis of Similarity of Samples.
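As a rough illustration of treating texts as samples, the sketch below turns each text into a vector of word counts and compares the vectors pairwise. Cosine similarity is used here only as one common choice of similarity measure; the source does not name a specific measure. The example texts and the helper names `word_counts` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens in one text (one sample)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical example texts; each text is one sample.
texts = [
    "the pump makes noise",
    "the pump makes loud noise",
    "delivery was late",
]

counts = [word_counts(t) for t in texts]

# Square matrix: similarity[i][j] compares text i with text j.
similarity = [[cosine_similarity(a, b) for b in counts] for a in counts]

for row in similarity:
    print([round(value, 2) for value in row])
```

The first two texts share most of their words and score close to 1, while the third shares no words with the others and scores 0.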
A text is made up of sentences, and a sentence is made up of words. Either the text or the sentence can serve as the unit (sample) of the analysis.
The output of the analysis differs depending on the unit.
MS Excel has a function that splits a text into sentences by using the period (.) as the delimiter.
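The same period-based split can be sketched outside Excel. The following is a minimal Python example, assuming that every period ends a sentence; the function name `split_sentences` and the sample text are hypothetical.

```python
def split_sentences(text):
    """Split a text into sentences at each period and drop empty pieces."""
    return [part.strip() for part in text.split(".") if part.strip()]


# Hypothetical example text.
text = "The pump makes noise. The noise is loud. We replaced the pump."
sentences = split_sentences(text)

for sentence in sentences:
    print(sentence)
```

Like any split on a single delimiter, this also cuts at periods that do not end a sentence, such as those in abbreviations or decimal numbers, so the result may need checking.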
The figure is an example of text mining data. We analyze co-occurrence using this matrix.
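Because the figure itself is not reproduced here, the layout below is an assumption: one row per unit (a text or a sentence), one column per word, and a 1 where the word appears in the unit. Under that assumption, co-occurrence can be sketched as counting, for each pair of words, the number of units that contain both. The sample units and the names `tokenize` and `cooccurrence` are hypothetical.

```python
import re


def tokenize(unit):
    """Return the lowercase word tokens of one unit (text or sentence)."""
    return re.findall(r"[a-z']+", unit.lower())


# Hypothetical units; each one becomes a row of the matrix.
units = [
    "the pump makes noise",
    "the noise is loud",
    "we replaced the pump",
]

# Columns of the matrix: every distinct word, in alphabetical order.
vocab = sorted({word for unit in units for word in tokenize(unit)})

# Rows = units, columns = words, 1 if the word appears in the unit.
matrix = [
    [1 if word in set(tokenize(unit)) else 0 for word in vocab]
    for unit in units
]


def cooccurrence(rows):
    """Count, for each pair of columns, the rows in which both are 1."""
    n = len(rows[0])
    return [
        [sum(row[i] * row[j] for row in rows) for j in range(n)]
        for i in range(n)
    ]


co = cooccurrence(matrix)
index = {word: i for i, word in enumerate(vocab)}

# Number of units in which "the" and "noise" appear together.
print(co[index["the"]][index["noise"]])
```

The diagonal of `co` holds the number of units that contain each word. Switching the unit from texts to sentences changes the rows of the matrix and therefore the co-occurrence counts.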
In daily use, we do not read a text as a group of words. If we expect methods for automatic translation or communication, we need more technology than this.
But the technology to use texts as groups of words has the potential to change the world.