GMU:Critical VR Lab I/L.-E. Kuehr

To generate the word clouds, the texts of the posts and comments are extracted from the JSON files and analyzed using the natural language processing technique ''word2vec''.
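The following is a minimal sketch of this step, assuming the ''gensim'' library is used for the word2vec training and that each JSON file contains a post with a ''text'' field and a list of comments with their own ''text'' fields; the folder name, field names and training parameters are illustrative assumptions, not taken from the project.
<syntaxhighlight lang="python">
import json
from pathlib import Path

from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

# Collect tokenized sentences from the extracted post and comment texts.
sentences = []
for path in Path("posts").glob("*.json"):  # hypothetical folder of JSON files
    data = json.loads(path.read_text(encoding="utf-8"))
    texts = [data.get("text", "")] + [c.get("text", "") for c in data.get("comments", [])]
    for text in texts:
        tokens = simple_preprocess(text)   # lowercase and tokenize
        if tokens:
            sentences.append(tokens)

# Train a word2vec model; every word is mapped to a high-dimensional vector
# (here 100 dimensions, an assumed value).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, epochs=20)
</syntaxhighlight>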
[[File:VRI-LEK-Epochs.PNG|none|1000px|Learning Process]]
The resulting vector space has very high dimensionality and therefore cannot be visualized directly. To reduce it to three dimensions, the method ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words that are close in the high-dimensional space close together.
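A minimal sketch of the reduction, assuming the scikit-learn implementation of t-SNE and the trained word2vec model from the previous step; the perplexity value and random seed are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np
from sklearn.manifold import TSNE

# Gather the vocabulary and its high-dimensional vectors from the model.
words = model.wv.index_to_key
vectors = np.asarray([model.wv[w] for w in words])

# Reduce the vectors to three dimensions while preserving local neighborhoods,
# so words that are close in the original space stay close in 3-D.
tsne = TSNE(n_components=3, perplexity=30, random_state=42)
coords = tsne.fit_transform(vectors)  # shape: (number of words, 3)
</syntaxhighlight>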
[[File:VRI-LEK-Graph.png|none|500px|Resulting Graph]]
The resulting data is then imported into Unity as a ''CSV'' file, and for every data point a billboard text of the corresponding word is generated. This process is repeated for every text.
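A minimal sketch of writing the words and their 3-D coordinates to a CSV file for the Unity import; the file name and column layout are assumptions, not the project's actual format.
<syntaxhighlight lang="python">
import csv

# Write one row per word with its 3-D position, to be read by a Unity script.
with open("word_cloud.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["word", "x", "y", "z"])
    for word, (x, y, z) in zip(words, coords):
        writer.writerow([word, f"{x:.4f}", f"{y:.4f}", f"{z:.4f}"])
</syntaxhighlight>
On the Unity side, each row can then be parsed and a camera-facing text object spawned at the position (x, y, z) to form the word cloud.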