Guessing content semantics with Solr
I don’t know whether this can really be called semantic analysis. What I want to do is create a Lucene index with two fields: word and category (multiValued). Then we can pass a chunk of text to it and retrieve the scores. It should output something like this:
Word/Category/Score
election/politics/1
obama/politics/1
microsoft/technology/1
microsoft/business/1
Then we sum up the scores per category: politics/2, technology/1, business/1
We may then guess that the content was about politics :D. LOL. Can we do something like that?
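To make the idea concrete, here is a rough sketch of the scoring step in plain Python, with no Solr involved. The WORD_CATEGORIES dict is only a hypothetical stand-in for the index (one entry per word, with a multi-valued category list), and the function names are made up for illustration. It assumes each distinct matched word contributes a score of 1 to each of its categories, as in the table above.

```python
import re
from collections import Counter

# Hypothetical stand-in for the index: word -> list of categories
# (the "category" field is multi-valued, so a word can have several).
WORD_CATEGORIES = {
    "election": ["politics"],
    "obama": ["politics"],
    "microsoft": ["technology", "business"],
}


def score_categories(text):
    """Sum a score of 1 for every (matched word, category) pair."""
    scores = Counter()
    # Lowercase the text and split on non-letters; each distinct word counts once.
    for word in set(re.findall(r"[a-z]+", text.lower())):
        for category in WORD_CATEGORIES.get(word, []):
            scores[category] += 1
    return scores


def guess_category(text):
    """Return the highest-scoring category, or None if no word matched."""
    scores = score_categories(text)
    return scores.most_common(1)[0][0] if scores else None


if __name__ == "__main__":
    sample = "Obama wins the election while Microsoft reports earnings"
    print(dict(score_categories(sample)))
    print(guess_category(sample))
```

In a real setup the dictionary lookup would presumably be replaced by a query against the index, and the score of 1 per match by whatever relevance score the query returns; that part is what I am asking about.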