@semordnilap it's funny how they started with a "swipe based" UI way back when, and how both iOS and Android are now moving to that.

@semordnilap like a single-hidden-layer autoencoder on a boolean bag of words (i.e. boolean BoW -> layer -> encoded values -> layer -> the same BoW again).

I think the encoded layer holds values for how much of each aspect/topic a document has. The weights hold, for each word, how strongly it is associated with each topic.

The output layer I'm not as sure about; ostensibly it just learns to invert the encoding back. That it is fit independently, and so has to find the weights again, might suggest this is not entirely optimal.
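A rough sketch of that setup in plain Python, in case it helps to poke at it. The toy documents, the two hidden units, the squared-error loss, the learning rate and the missing bias terms are all my assumptions for illustration, not taken from any particular implementation. The decoder weights are kept separate from the encoder weights, so the output layer is fit independently as described.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W_enc, W_dec):
    # Encoded values: one activation per hidden unit ("topic").
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_enc]
    # Reconstruction: one value in (0, 1) per vocabulary word.
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in W_dec]
    return h, y

def train(docs, n_hidden=2, epochs=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_words = len(docs[0])
    # W_enc[j][i]: how strongly word i feeds hidden unit j.
    W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_words)]
             for _ in range(n_hidden)]
    # W_dec[i][j]: separate (untied) weights mapping unit j back to word i.
    W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
             for _ in range(n_words)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in docs:
            h, y = forward(x, W_enc, W_dec)
            total += sum((yi - xi) ** 2 for xi, yi in zip(x, y))
            # Squared-error gradient at the output pre-activations
            # (up to a constant factor absorbed into lr).
            d_out = [(yi - xi) * yi * (1.0 - yi) for xi, yi in zip(x, y)]
            # Backpropagate to the hidden pre-activations.
            d_hid = [h[j] * (1.0 - h[j])
                     * sum(d_out[i] * W_dec[i][j] for i in range(n_words))
                     for j in range(n_hidden)]
            for i in range(n_words):
                for j in range(n_hidden):
                    W_dec[i][j] -= lr * d_out[i] * h[j]
            for j in range(n_hidden):
                for i in range(n_words):
                    W_enc[j][i] -= lr * d_hid[j] * x[i]
        losses.append(total)
    return W_enc, W_dec, losses

# Hypothetical toy corpus: 4 documents over a 6-word vocabulary,
# 1 = word present, 0 = word absent.
DOCS = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

W_enc, W_dec, losses = train(DOCS)
h, y = forward(DOCS[0], W_enc, W_dec)
print("first / last epoch loss:", losses[0], losses[-1])
print("encoded values for doc 0:", h)
```

Reading off row j of `W_enc` gives the per-word association with hidden unit j, which is the "how much is each word tied to each topic" part.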

@semordnilap And then there is also Latent Dirichlet Allocation, which assumes different causes (like topics) for different words.

LDA does seem much less black-boxy than the neural net. The theory is hard, though. For one, Wikipedia says it is about a mixture, so you'd expect topics not to exclude each other, but I don't see through the math well enough to tell if that's true.

From what I've read, there is no clear way to determine the best number of topics. Still, something like LDA might be superior.
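On the mixture question, the generative story LDA assumes can be sketched in a few lines of plain Python: each document draws its own topic proportions from a Dirichlet, and then every word position picks its own topic from those proportions. So within one document, topics are mixed rather than mutually exclusive. This only samples from the model and does no inference; the vocabulary, the two topic-word distributions and the `alpha` values are made up for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    # Standard construction: normalize independent Gamma(alpha_k, 1) draws.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    # Return an index drawn with the given probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(topic_word, alpha, n_words, rng):
    # Per-document topic proportions (the "mixture" part).
    theta = sample_dirichlet(alpha, rng)
    topics, words = [], []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)          # topic for this word
        w = sample_categorical(topic_word[z], rng)  # word from that topic
        topics.append(z)
        words.append(w)
    return theta, topics, words

# Hypothetical vocabulary and topic-word distributions (each row sums to 1).
VOCAB = ["kernel", "driver", "patch", "swipe", "touch", "screen"]
TOPIC_WORD = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # topic 0
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # topic 1
]

rng = random.Random(0)
theta, topics, words = generate_document(TOPIC_WORD, [1.0, 1.0], 10, rng)
print("topic proportions:", theta)
print("per-word topics:  ", topics)
print("words:            ", [VOCAB[w] for w in words])
```

The number of topics is just the number of rows in `TOPIC_WORD` here, which is the parameter that has to be chosen up front.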
