GMU:Critical VR Lab I/L.-E. Kuehr

To generate the word clouds, the texts of the posts and comments are extracted from the JSON files and analyzed using the natural language processing technique ''word2vec''.
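The following is a minimal sketch of this step, assuming the ''gensim'' library is used for the word2vec training and that each JSON file contains a post with a ''text'' field and a list of comments with their own ''text'' fields; the folder name, field names and training parameters are illustrative assumptions, not taken from the project.
<syntaxhighlight lang="python">
import json
from pathlib import Path

from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

# Collect tokenized sentences from the extracted post and comment texts.
sentences = []
for path in Path("posts").glob("*.json"):  # hypothetical folder of JSON files
    data = json.loads(path.read_text(encoding="utf-8"))
    texts = [data.get("text", "")] + [c.get("text", "") for c in data.get("comments", [])]
    for text in texts:
        tokens = simple_preprocess(text)   # lowercase and tokenize
        if tokens:
            sentences.append(tokens)

# Train a word2vec model; every word is mapped to a high-dimensional vector
# (here 100 dimensions, an assumed value).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, epochs=20)
</syntaxhighlight>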
[[File:VRI-LEK-Epochs.PNG|none|1000px|Learning Process]]
The resulting vector space has very high dimensionality and therefore cannot be visualized directly. To reduce it to three dimensions, the method ''t-distributed stochastic neighbor embedding'' (t-SNE) is used, which keeps words that are close in the high-dimensional space close together.
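A minimal sketch of the reduction, assuming the scikit-learn implementation of t-SNE and the trained word2vec model from the previous step; the perplexity value and random seed are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np
from sklearn.manifold import TSNE

# Gather the vocabulary and its high-dimensional vectors from the model.
words = model.wv.index_to_key
vectors = np.asarray([model.wv[w] for w in words])

# Reduce the vectors to three dimensions while preserving local neighborhoods,
# so words that are close in the original space stay close in 3-D.
tsne = TSNE(n_components=3, perplexity=30, random_state=42)
coords = tsne.fit_transform(vectors)  # shape: (number of words, 3)
</syntaxhighlight>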
[[File:VRI-LEK-Graph.png|none|500px|Resulting Graph]]
The resulting data is then imported into Unity as a ''CSV'' file, and for every data point a billboard text of the corresponding word is generated. This process is repeated for every text.
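A minimal sketch of writing the words and their 3-D coordinates to a CSV file for the Unity import; the file name and column layout are assumptions, not the project's actual format.
<syntaxhighlight lang="python">
import csv

# Write one row per word with its 3-D position, to be read by a Unity script.
with open("word_cloud.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["word", "x", "y", "z"])
    for word, (x, y, z) in zip(words, coords):
        writer.writerow([word, f"{x:.4f}", f"{y:.4f}", f"{z:.4f}"])
</syntaxhighlight>
On the Unity side, each row can then be parsed and a camera-facing text object spawned at the position (x, y, z) to form the word cloud.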