Pausanias Analysis

Sentence-level words and phrases that predict mythic vs. historical content

Model Performance Metrics

Performance of the logistic regression classifier on the test set:

Class       Precision  Recall  F1-Score  Support
Historical  0.777      0.780   0.779     255
Mythic      0.634      0.630   0.632     154

Overall accuracy: 0.724 (support: 409)
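
The reported figures can be cross-checked against a confusion matrix. The sketch below assumes per-class counts back-solved from the reported recall and support (199 of 255 historical items and 97 of 154 mythic items classified correctly). These counts are an inference, not values given in the source, and the names `confusion`, `class_metrics`, and `accuracy` are illustrative.

```python
# Confusion counts inferred from reported recall x support (an assumption,
# not stated in the source). Outer key: true class. Inner key: predicted class.
confusion = {
    "Historical": {"Historical": 199, "Mythic": 56},
    "Mythic": {"Historical": 57, "Mythic": 97},
}


def class_metrics(confusion, label):
    """Return (precision, recall, f1, support) for one class."""
    true_positives = confusion[label][label]
    support = sum(confusion[label].values())
    predicted = sum(row[label] for row in confusion.values())
    precision = true_positives / predicted
    recall = true_positives / support
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, support


def accuracy(confusion):
    """Fraction of all items that fall on the diagonal of the matrix."""
    correct = sum(confusion[label][label] for label in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


for label in confusion:
    p, r, f1, n = class_metrics(confusion, label)
    print(f"{label:<12}{p:.3f}  {r:.3f}  {f1:.3f}  {n}")
print(f"{'Accuracy':<12}{accuracy(confusion):.3f}")
```

Under these assumed counts the computed values round to the figures reported above, which suggests the table is internally consistent.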

Sentence Predictors of Mythic Content

Word/Phrase  Coefficient  Mythic Count  Non-mythic Count  p-value   q-value
θησεὺς       2.3105       26            1                 6.04e-11  6.04e-08
εἶναι        2.0600       82            49                4.48e-10  2.24e-07
λέγουσιν     1.9842       52            25                2.03e-08  4.06e-06
θησέως       1.9725       24            2                 3.71e-09  1.1e-06
παῖδα        1.7512       36            13                1.09e-07  1.36e-05
ἰλίου        1.6861       12            0                 5.48e-06  0.000422
ὑπὸ          1.6794       90            83                1.58e-05  0.000928
φασιν        1.6274       25            7                 1.33e-06  0.000148
θησέα        1.5370       19            0                 4.42e-09  1.1e-06
ὄνομα        1.5124       32            14                4.51e-06  0.000376
αὐτὸν        1.4947       39            24                3.58e-05  0.00171
αὐτὴν        1.4858       29            16                0.000131  0.00467
ἀχιλλέως     1.4736       16            0                 9.41e-08  1.34e-05
τὸν          1.4653       165           190               2.72e-05  0.00143
γενέσθαι     1.3630       36            25                0.000315  0.00872
εἰσιν        1.3227       16            17                0.161     0.478
ὅμηρος       1.3213       15            1                 2.75e-06  0.00025
λέγουσι      1.2954       31            14                8.97e-06  0.000641
ἀποθανεῖν    1.2744       23            3                 5.44e-08  9.06e-06
λέγεται      1.2107       29            16                0.000131  0.00467
ἡρακλέους    1.1844       17            2                 2.53e-06  0.00025
κάθηται      1.1660       6             0                 0.00238   0.0396
πρῶτον       1.1588       22            9                 9.13e-05  0.0038
μνῆμά        1.1539       10            3                 0.00621   0.069
ἡρῷον        1.1354       9             1                 0.000772  0.0184
λόγον        1.1086       12            3                 0.000738  0.018
ἀλκάθου      1.0890       7             0                 0.000866  0.0197
φασὶν        1.0742       14            7                 0.00503   0.0621
μάχην        1.0490       10            6                 0.0353    0.207
τὸ ὄνομα     1.0425       17            4                 3.86e-05  0.00175
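
The q-value column is consistent with a Benjamini-Hochberg false discovery rate adjustment over 1,000 tested features: for θησεὺς, 6.04e-11 × 1000 / 1 = 6.04e-08, and for εἶναι, 4.48e-10 × 1000 / 2 = 2.24e-07. The source does not name the method, so both the procedure and the feature count are inferences from the numbers. A minimal sketch under those assumptions, with `bh_adjust` and `n_tests` as illustrative names:

```python
def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg q-values, returned in the input order.

    n_tests is the total number of hypotheses tested. Passing a value larger
    than len(pvalues) is only valid when pvalues holds the smallest
    p-values of the full set.
    """
    m = len(pvalues) if n_tests is None else n_tests
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    qvalues = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is capped by
    # the adjusted values of every larger p-value.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Five smallest p-values listed above: θησεὺς, εἶναι, θησέως, θησέα, λέγουσιν.
top_p = [6.04e-11, 4.48e-10, 3.71e-09, 4.42e-09, 2.03e-08]
# n_tests=1000 is inferred from the q/p ratios, not stated in the source.
for q in bh_adjust(top_p, n_tests=1000):
    print(f"{q:.3g}")
```

On this reading, θησέως and θησέα share q = 1.1e-06 because of the running minimum: the rank-3 value 3.71e-09 × 1000 / 3 ≈ 1.24e-06 is capped by the rank-4 value 4.42e-09 × 1000 / 4 ≈ 1.1e-06.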

Sentence Predictors of Historical Content

Word/Phrase  Coefficient  Mythic Count  Non-mythic Count  p-value   q-value
οἱ           -1.8190      107           273               0.000118  0.00453
ἄγαλμα       -1.6989      9             50                0.000223  0.00744
τὰ           -1.4273      80            225               3.07e-05  0.00154
σφισιν       -1.3006      16            43                0.116     0.382
φωκέων       -1.2292      0             22                5.82e-05  0.00253
μακεδόνων    -1.1589      0             25                2.07e-05  0.00115
αὖθις        -1.0726      3             26                0.00284   0.0427
πτολεμαῖον   -1.0550      0             18                0.000323  0.00872
ἦν           -1.0456      26            76                0.0141    0.122
μὴ           -1.0312      7             24                0.0901    0.338
χαλκοῦν      -0.9905      1             23                0.000383  0.0101
ἀλεξάνδρου   -0.9895      1             17                0.00523   0.0624
ἀφροδίτης    -0.9368      1             13                0.0236    0.172
μέν          -0.9174      6             30                0.00779   0.082
μέν ἐστιν    -0.9116      0             13                0.00313   0.0454
ναός         -0.9106      2             12                0.0985    0.354
μάλιστα      -0.9079      16            55                0.00938   0.0928
ἔργα         -0.9060      2             14                0.0647    0.269
κεῖται       -0.9052      4             19                0.0791    0.314
φιλίππου     -0.8995      0             12                0.00537   0.0624
πτολεμαῖος   -0.8952      0             14                0.00329   0.0464
τὸ ἔργον     -0.8765      1             17                0.00523   0.0624
ἦσαν         -0.8566      4             24                0.0161    0.132
ἀντιγόνου    -0.8538      0             16                0.000983  0.0209
οὕτω         -0.8509      12            26                0.511     0.807
ἀνάθημα      -0.8416      2             12                0.0985    0.354
ἀνέθεσαν     -0.8384      3             22                0.0107    0.101
ναυσὶν       -0.8197      5             24                0.0209    0.156
ἀντὶ         -0.8096      5             13                0.425     0.731
λίθου        -0.8087      2             16                0.0257    0.185
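
In a logistic regression a coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The sketch below assumes that the positive class is Mythic (as the signs in the two tables suggest) and that each feature is on a scale where a one-unit increase is meaningful, for example a binary presence flag. The source does not describe the feature encoding, and `odds_ratio` is an illustrative helper.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds of the positive class per unit of the feature."""
    return math.exp(coefficient)


# Coefficients copied from the two tables above.
examples = {"θησεὺς": 2.3105, "λέγουσιν": 1.9842, "οἱ": -1.8190, "ἄγαλμα": -1.6989}
for word, coef in examples.items():
    print(f"{word}: odds of Mythic multiplied by {odds_ratio(coef):.2f}")
```

Under those assumptions, θησεὺς multiplies the odds of the Mythic label by roughly 10, while οἱ reduces them to about one sixth. If the features are weighted counts rather than presence flags, the per-unit interpretation changes accordingly.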