Pausanias Analysis

Greta sentence-level mythic vs. historical analysis

Surface, including books 4 and 8, without rhetoric markers

This model drops sentences tagged "other" and fits a class-balanced logistic regression on TF-IDF features to the remaining sentences, which are tagged either mythic or historical.
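The setup described above can be sketched roughly as follows. This is a minimal illustration, not the report's actual code: the source does not name a library, so scikit-learn is an assumption, as is reading "balanced" as `class_weight="balanced"`. The `ngram_range=(1, 2)` and `max_features=1000` settings are guesses based on the two-word phrases and the 1,000-feature count reported further down, and the function name `fit_mythic_historical` is hypothetical. The train/test split is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def fit_mythic_historical(sentences, tags, max_features=1000):
    """Drop sentences tagged 'other', then fit TF-IDF + logistic regression.

    All hyperparameters here are assumptions, not values confirmed by the report.
    """
    # Keep only the sentences tagged mythic or historical.
    kept = [(s, t) for s, t in zip(sentences, tags) if t != "other"]
    texts = [s for s, _ in kept]
    labels = [t for _, t in kept]

    model = make_pipeline(
        # Unigrams and bigrams, capped at max_features terms (assumed settings).
        TfidfVectorizer(ngram_range=(1, 2), max_features=max_features),
        # class_weight="balanced" reweights classes inversely to their frequency.
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    model.fit(texts, labels)
    return model
```

With string labels, scikit-learn orders the classes alphabetically, so "mythic" is the positive class and positive coefficients point toward mythic sentences, which matches the sign convention in the predictor tables.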

Model Performance Metrics

The logistic regression classifier achieved the following metrics on the test set:

Class       Precision  Recall  F1-Score  Support
Historical  0.606      0.677   0.640     248
Mythic      0.752      0.689   0.719     351

Overall Accuracy: 0.684 (support: 599)

Confusion Matrix

                    Predicted
                    Historical  Mythic
Actual  Historical  168         80
        Mythic      109         242
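
The per-class metrics can be recomputed from the confusion matrix alone. The sketch below uses only the four counts shown above. The helper name `per_class_metrics` is hypothetical and not part of the original analysis.

```python
def per_class_metrics(matrix, labels):
    """Compute precision, recall, F1 and support per class, plus accuracy.

    matrix[i][j] counts sentences whose actual class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum, i.e. support
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    report["accuracy"] = correct / total
    return report


# Counts taken from the confusion matrix above.
report = per_class_metrics([[168, 80], [109, 242]], ["Historical", "Mythic"])
for label in ("Historical", "Mythic"):
    row = report[label]
    print(f"{label:<12}{row['precision']:.3f}  {row['recall']:.3f}  "
          f"{row['f1']:.3f}  {row['support']}")
print(f"Accuracy    {report['accuracy']:.3f}")
```

Rounded to three decimals, these values agree with the metrics table: for example, Historical precision is 168 / (168 + 109) and overall accuracy is (168 + 242) / 599.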

Counts

2,393 mythic/historical sentences
1,404 mythic
989 historical
1,000 features

Predictors of Mythic Sentences

Word/Phrase  Coefficient  Mythic Count  Historical Count  p-value   q-value
εἶναι        2.4335       168           36                4.29e-14  4.29e-11
φασιν        1.6178       70            9                 2.53e-09  1.26e-06
φασὶν        1.4880       43            5                 1.62e-06  0.000324
ἐστιν        1.4847       122           39                2.52e-06  0.00036
πρῶτον       1.3351       44            11                0.000673  0.0217
αὐτοῦ        1.2896       45            16                0.0127    0.121
ἐστὶν        1.2733       55            20                0.00718   0.0909
δʼ           1.2565       32            4                 0.00011   0.0058
παῖδασ       1.2266       44            9                 0.000118  0.00588
βωμὸσ        1.2162       17            0                 0.000196  0.00934
διὰ          1.1972       48            19                0.0254    0.167
ἔπη          1.1725       16            0                 0.000374  0.0139
ἱδρύσατο     1.1240       10            0                 0.00694   0.089
ὅτι          1.0875       51            16                0.00236   0.0437
ἔστι         1.0447       48            15                0.00309   0.0551
θυγάτηρ      1.0331       7             2                 0.322     0.62
αὐτήν        1.0243       10            2                 0.139     0.407
αὐτῶι        1.0207       65            22                0.00144   0.0343
ποιῆσαι      1.0144       20            3                 0.00498   0.0743
ὄντα         0.9932       29            9                 0.0212    0.163
ἀρχαῖον      0.9702       18            3                 0.0128    0.121
λόγον        0.9559       16            3                 0.0325    0.198
αὐτῆι        0.9518       21            5                 0.0157    0.139
δὲ εἶναι     0.9447       24            1                 4.17e-05  0.00261
καὶ ὅτι      0.9389       13            0                 0.00115   0.0292
ὄνομα        0.9386       53            14                0.000327  0.0136
παισὶν       0.9180       7             0                 0.0462    0.237
καὶ εἶναι    0.8905       12            0                 0.00207   0.0397
αὐτὴν        0.8843       28            18                0.759     0.903
φασι         0.8828       21            4                 0.0126    0.121
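
The predictor tables appear to split the fitted coefficients by sign: positive values point toward mythic, negative values toward historical, each ranked by magnitude. The sketch below shows one way to produce such a split. The function name `split_predictors` and the `coefficients` mapping are hypothetical. If scikit-learn were used, the mapping could be built from the vectorizer's `get_feature_names_out()` and the classifier's `coef_[0]`, but the source does not confirm this.

```python
def split_predictors(coefficients, top_k=30):
    """Split a term -> coefficient mapping into mythic and historical lists.

    Positive coefficients are ranked from largest to smallest (mythic);
    negative coefficients from most negative upward (historical).
    """
    mythic = sorted(
        ((term, coef) for term, coef in coefficients.items() if coef > 0),
        key=lambda pair: pair[1],
        reverse=True,
    )
    historical = sorted(
        ((term, coef) for term, coef in coefficients.items() if coef < 0),
        key=lambda pair: pair[1],
    )
    return mythic[:top_k], historical[:top_k]


# A few coefficients copied from the tables in this report.
sample = {
    "φασιν": 1.6178,
    "πόλεμον": -1.5219,
    "εἶναι": 2.4335,
    "πολέμωι": -1.5708,
}
mythic, historical = split_predictors(sample)
print(mythic)      # strongest mythic predictor first
print(historical)  # strongest historical predictor first
```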

Predictors of Historical Sentences

Word/Phrase  Coefficient  Mythic Count  Historical Count  p-value   q-value
πολέμωι      -1.5708      4             22                7.39e-06  0.000821
πόλεμον      -1.5219      6             28                7.3e-07   0.000183
τούτουσ      -1.5103      6             13                0.0168    0.142
πολέμου      -1.4840      7             24                3.78e-05  0.0026
ἄνδρασ       -1.4514      5             16                0.0011    0.029
οἰκίασ       -1.4070      2             16                3.9e-05   0.0026
καὶ ἐσ       -1.2903      38            55                0.000426  0.015
ἐναντία      -1.2351      0             13                9.8e-06   0.00098
ναυσὶν       -1.2238      3             22                3.13e-06  0.000391
στρατιᾶι     -1.2070      1             14                3.72e-05  0.0026
μὴ           -1.1549      19            25                0.037     0.216
αὐτὸσ        -1.1417      19            34                0.00072   0.0225
βασιλέων     -1.1372      0             12                2.39e-05  0.002
πολλά        -1.0945      0             6                 0.00494   0.0743
κόσμον       -1.0550      0             7                 0.00203   0.0397
ποτε         -1.0288      12            14                0.197     0.484
σφῶν         -1.0239      0             9                 0.000344  0.0138
ὅσον         -1.0097      6             15                0.00508   0.0748
χαλκοῦν      -1.0080      2             9                 0.0103    0.116
καὶ ἔτι      -0.9895      0             8                 0.000837  0.0239
ἤδη          -0.9722      23            31                0.0163    0.141
ὃσ           -0.9651      23            39                0.000534  0.0178
οὖν          -0.9432      34            40                0.0252    0.167
κοινῶι       -0.9313      2             7                 0.0383    0.216
πρώτοισ      -0.9288      3             6                 0.175     0.435
πλέον        -0.9257      5             14                0.0041    0.0695
οὐ           -0.9223      86            76                0.137     0.407
δὲ τοῦτον    -0.9009      2             4                 0.238     0.533
τὰ           -0.8938      120           154               1.37e-07  4.58e-05
χρήματα      -0.8842      0             11                5.81e-05  0.00342
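
The report does not say how the q-values were derived from the p-values. The listed numbers are consistent with a Benjamini-Hochberg correction over all 1,000 features: the smallest p-value shown, 4.29e-14, has a q-value exactly 1,000 times larger. Under that assumption, the adjustment can be sketched as follows. The function name `benjamini_hochberg` is illustrative and not taken from the source.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values never
    # increase as the p-values get smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy input, not data from the report.
print(benjamini_hochberg([0.001, 0.5, 0.02]))
```

Because the adjustment depends on the rank of each p-value among all tested features, the q-values in the two tables cannot be reproduced from the 60 rows shown here alone.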