@@ -745,12 +745,12 @@ class DistributedLDAModel private[clustering] (
val N_wk = vertex._2
val smoothed_N_wk: TopicCounts = N_wk + (eta - 1.0)
val phi_wk: TopicCounts = smoothed_N_wk :/ smoothed_N_k
- (eta - 1.0) * sum(phi_wk.map(math.log))
+ sumPrior + (eta - 1.0) * sum(phi_wk.map(math.log))
} else {
val N_kj = vertex._2
val smoothed_N_kj: TopicCounts = N_kj + (alpha - 1.0)
val theta_kj: TopicCounts = normalize(smoothed_N_kj, 1.0)
- (alpha - 1.0) * sum(theta_kj.map(math.log))
+ sumPrior + (alpha - 1.0) * sum(theta_kj.map(math.log))
}
}
graph.vertices.aggregate(0.0)(seqOp, _ + _)
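
The change above adds the running total `sumPrior` back into the value returned by `seqOp`. A minimal sketch of why that matters, using plain Scala collections rather than Spark: the `partitions` data and the local `aggregate` helper are hypothetical stand-ins that only imitate the fold-per-partition-then-combine shape of `aggregate(zero)(seqOp, combOp)`; they are not the code from this PR.

```scala
object SeqOpAccumulatorSketch {
  // Hypothetical per-vertex contributions, split into two "partitions".
  val partitions: Seq[Seq[Double]] = Seq(Seq(1.0, 2.0, 3.0), Seq(4.0, 5.0))

  // Imitates aggregate(zero)(seqOp, combOp) on local data:
  // fold each partition with seqOp, then merge the partition results with combOp.
  def aggregate(zero: Double)(
      seqOp: (Double, Double) => Double,
      combOp: (Double, Double) => Double): Double =
    partitions.map(_.foldLeft(zero)(seqOp)).foldLeft(zero)(combOp)

  def main(args: Array[String]): Unit = {
    // Buggy shape: seqOp ignores the accumulator, so each partition
    // keeps only the contribution of its last element.
    val buggy = aggregate(0.0)((_, x) => x, _ + _)

    // Fixed shape: seqOp adds each contribution to the running sum.
    val fixed = aggregate(0.0)((acc, x) => acc + x, _ + _)

    println(s"buggy = $buggy, fixed = $fixed") // buggy = 8.0, fixed = 15.0
  }
}
```

In this sketch the buggy result depends on how the data happens to be partitioned, which is one reason such a bug can go unnoticed until values are compared across runs or after a save/load round trip.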
@@ -260,6 +260,14 @@ class LDASuite extends SparkFunSuite with MLlibTestSparkContext with DefaultRead
Vectors.dense(model2.topicsMatrix.toArray) absTol 1e-6)
assert(Vectors.dense(model.getDocConcentration) ~==
Vectors.dense(model2.getDocConcentration) absTol 1e-6)
+ val logPrior = model.asInstanceOf[DistributedLDAModel].logPrior
+ val logPrior2 = model2.asInstanceOf[DistributedLDAModel].logPrior
+ val trainingLogLikelihood =
+   model.asInstanceOf[DistributedLDAModel].trainingLogLikelihood
+ val trainingLogLikelihood2 =
+   model2.asInstanceOf[DistributedLDAModel].trainingLogLikelihood
+ assert(logPrior ~== logPrior2 absTol 1e-6)
+ assert(trainingLogLikelihood ~== trainingLogLikelihood2 absTol 1e-6)
Member
Should we also check that trainingLogLikelihood and logPrior do not change for LocalLDAModel?

Contributor Author
trainingLogLikelihood and logPrior exist only on the distributed model.

Member
Right. I mean checking that they are not persisted and then loaded as an unexpected but valid value (i.e. something other than Double.NaN).

Contributor Author
LocalLDAModel doesn't extend DistributedLDAModel, and vice versa. It's not clear to me how to check trainingLogLikelihood and logPrior on a LocalLDAModel.

Member
OK, I guess I remembered this wrong because of the other PR.

}
val lda = new LDA()
testEstimatorAndModelReadWrite(lda, dataset,