The objective of this analysis is to compare the performance of a classifier trained to identify accountability specific to a single event against a classifier trained to identify accountability in general. If there are common features across all events that indicate accountability, then performance should increase when the classifier is trained on all the news events, since more data typically improves performance.

However, if performance decreases when the classifier is trained on multiple datasets, it is likely that there are no prominent features that capture the meaning of accountability in general. This could indicate that the annotations of accountability are event specific, or that there is not enough data from a variety of events to capture a generalized representation of accountability.
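The comparison described above can be sketched as two training regimes evaluated on the same held-out data. This is a minimal sketch, not the procedure used for the results in this post: the function names, the 25% test fraction, the label encoding, and the toy data are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

# (text, label); label 1 marks accountability (assumed encoding)
Example = Tuple[str, int]


def split_event(examples: List[Example], test_fraction: float, seed: int):
    """Shuffle one event's examples and split them into (train, test)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def build_regimes(data: Dict[str, List[Example]],
                  test_fraction: float = 0.25, seed: int = 0):
    """For each event, return both training sets and one shared test set."""
    splits = {event: split_event(rows, test_fraction, seed)
              for event, rows in data.items()}
    # Pooled training data: the training portion of every event.
    pooled_train = [row for train, _ in splits.values() for row in train]
    regimes = {}
    for event, (train, test) in splits.items():
        regimes[event] = {
            "event_specific_train": train,  # only this event's data
            "pooled_train": pooled_train,   # training data from all events
            "test": test,                   # same held-out set for both
        }
    return regimes


# Hypothetical toy data, not taken from the study.
toy = {
    "event_a": [(f"a{i}", i % 2) for i in range(8)],
    "event_b": [(f"b{i}", i % 2) for i in range(12)],
}
regimes = build_regimes(toy)
```

Scoring both regimes on the same per-event test set keeps the comparison fair: any difference in score then comes from the training data rather than from the evaluation examples.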

The results in this post also compare sentence-level and excerpt-level classifiers, as well as different representations and classification algorithms.

Summary of Findings

The main observation is that there is a wide range of performance results, with some events achieving an F-score above 0.8, while others are as low as ~0.5. Note also that the inter-annotator agreement for the events ranges from 0.6 to 0.8.
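For readers less familiar with the metric, here is a minimal sketch of how an F-score can be computed from predictions. It assumes the binary F1 variant measured on the positive (accountability) class. The post does not state which F-score variant was used, so treat that choice as an assumption.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall on one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: the score is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: 2 true positives, 1 false positive, 1 false negative.
score = f1_score([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])

# A classifier that never predicts the positive class scores exactly 0.
degenerate = f1_score([1, 1, 0], [0, 0, 0])
```

Under this definition, a score of exactly 0 means the classifier found no true positives at all, which is one way very low scores can arise when positive examples are rare.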

Moving from excerpt-level to sentence-level classification also tends to decrease performance, but not by as much as the choice of event does.

An additional finding is that the character-based representation and the SVM classifier produced the best results among the methods tested in this analysis (the highest maximum F-score in the combined tables). However, the difference in performance among the linear classifiers is almost negligible across all the variations tested.
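To make the difference between word-based and character-based representations concrete, here is a small sketch of both kinds of n-gram features. The n-gram ranges and the whitespace tokenization are assumptions for illustration. The post does not specify how the representations were configured.

```python
from collections import Counter


def word_ngrams(text: str, n_min: int = 1, n_max: int = 1) -> Counter:
    """Count word n-grams for every n in [n_min, n_max]."""
    tokens = text.lower().split()
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


def char_ngrams(text: str, n_min: int = 2, n_max: int = 4) -> Counter:
    """Count character n-grams for every n in [n_min, n_max]."""
    s = text.lower()
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(s) - n + 1):
            grams[s[i:i + n]] += 1
    return grams


# Illustrative inputs only.
words_a = word_ngrams("officials were blamed")
words_b = word_ngrams("critics blame officials")
chars_a = char_ngrams("officials were blamed")
chars_b = char_ngrams("critics blame officials")

# Word unigrams treat "blamed" and "blame" as unrelated features,
# while both texts share the character n-gram "blam".
shared_word_stem = ("blamed" in words_b) or ("blame" in words_a)
shared_char_gram = ("blam" in chars_a) and ("blam" in chars_b)
```

Character n-grams let related word forms share features, which is one plausible reason a character-based representation could hold up well on small datasets.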

The results are summarized in the tables in the following sections.
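Each table reports count, mean, standard deviation, minimum, quartiles, and maximum of the scores within a group (an event, a classifier, or a vectorizer). This column layout matches what pandas' `describe()` produces, though whether pandas was used here is an assumption. The sketch below computes the same statistics with the standard library. The function name and the sample scores are hypothetical.

```python
import statistics
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def describe_by_group(rows: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Group (name, score) pairs by name and summarize each group.

    Each group needs at least two scores for the sample standard deviation.
    """
    grouped = defaultdict(list)
    for group, score in rows:
        grouped[group].append(score)

    summary = {}
    for group, scores in grouped.items():
        # "inclusive" quartiles use linear interpolation between data points.
        q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
        summary[group] = {
            "count": float(len(scores)),
            "mean": statistics.fmean(scores),
            "std": statistics.stdev(scores),  # sample standard deviation
            "min": min(scores),
            "25%": q1,
            "50%": median,
            "75%": q3,
            "max": max(scores),
        }
    return summary


# Hypothetical scores, not taken from the tables.
rows = [("a", 0.1), ("a", 0.2), ("a", 0.3), ("a", 0.4), ("a", 0.5),
        ("b", 0.0), ("b", 1.0)]
summary = describe_by_group(rows)
```

Read this way, the `std` and the gap between `min` and `max` in each row show how much the score varies across the configurations in that group, while `mean` and `50%` show its typical level.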

Individual Events

Sentences

event           count      mean       std       min       25%       50%       75%       max
Charleston      108.0  0.232965  0.141681  0.000000  0.120760  0.245000  0.357264  0.462500
Isla Vista      108.0  0.738317  0.024722  0.693498  0.721297  0.738237  0.750443  0.802410
Marysville      108.0  0.710327  0.028182  0.654321  0.687905  0.710819  0.733473  0.761905
Newtown         108.0  0.363988  0.127713  0.152091  0.245902  0.397473  0.475519  0.560870
Orlando         108.0  0.270458  0.157829  0.000000  0.138889  0.278532  0.418455  0.476190
San Bernardino  108.0  0.321567  0.116414  0.096386  0.235294  0.336304  0.421232  0.522293
Vegas           108.0  0.141236  0.115760  0.000000  0.000000  0.121212  0.250880  0.380952

Excerpts

event           count      mean       std       min       25%       50%       75%       max
Charleston      108.0  0.323976  0.141580  0.050000  0.215686  0.338462  0.448497  0.568182
Isla Vista      108.0  0.757786  0.022385  0.722045  0.740443  0.754337  0.777850  0.813754
Marysville      108.0  0.762653  0.060974  0.649351  0.717634  0.768177  0.810127  0.882353
Newtown         108.0  0.413744  0.164910  0.067797  0.337558  0.476467  0.522574  0.599156
Orlando         108.0  0.237244  0.146436  0.000000  0.117647  0.288018  0.354430  0.487805
San Bernardino  108.0  0.412743  0.130067  0.121212  0.333333  0.448881  0.500000  0.615385
Vegas           108.0  0.143728  0.152868  0.000000  0.000000  0.080000  0.285714  0.518519

Combined Datasets

Sentences

classifier              count      mean       std       min       25%       50%       75%       max
logregcv                 27.0  0.538240  0.026181  0.502447  0.521963  0.528771  0.563489  0.583333
logregcv_balanced        27.0  0.560509  0.032042  0.504496  0.532829  0.565003  0.591144  0.606166
random_forest_balanced   27.0  0.506861  0.008723  0.489726  0.502209  0.505082  0.509686  0.532418
svm_balanced             27.0  0.552539  0.026245  0.511447  0.536557  0.550852  0.573312  0.607453

vectorizer              count      mean       std       min       25%       50%       75%       max
1gram                    12.0  0.523218  0.014292  0.502758  0.512716  0.524475  0.531200  0.546624
3gram                    12.0  0.551585  0.032038  0.504334  0.528432  0.556689  0.574398  0.593997
char                     12.0  0.570441  0.031584  0.518699  0.548608  0.579456  0.590890  0.607453
cust_all-1gram           12.0  0.517531  0.016412  0.496622  0.502655  0.517321  0.532072  0.542048
cust_all-3gram           12.0  0.548571  0.034563  0.498834  0.518188  0.558592  0.578283  0.590374
cust_no_nums-1gram       12.0  0.519189  0.017042  0.489726  0.505752  0.520005  0.534048  0.540292
cust_no_nums-3gram       12.0  0.550599  0.033735  0.505082  0.518338  0.557842  0.578534  0.593817
cust_only_alpha-1gram    12.0  0.518241  0.013186  0.501186  0.507472  0.517567  0.526971  0.537879
cust_only_alpha-3gram    12.0  0.556459  0.034341  0.506245  0.526145  0.568348  0.578001  0.602856

Excerpts

classifier              count      mean       std       min       25%       50%       75%       max
logregcv                 27.0  0.606388  0.029977  0.548485  0.592954  0.600801  0.624813  0.658854
logregcv_balanced        27.0  0.616874  0.023957  0.576471  0.600716  0.612943  0.629839  0.661818
random_forest_balanced   27.0  0.470446  0.021622  0.440141  0.454623  0.464883  0.489271  0.507993
svm_balanced             27.0  0.614373  0.028834  0.566914  0.590567  0.618690  0.629487  0.670190

vectorizer              count      mean       std       min       25%       50%       75%       max
1gram                    12.0  0.556811  0.060908  0.440141  0.540708  0.577642  0.601289  0.606838
3gram                    12.0  0.587291  0.075564  0.451049  0.563028  0.620402  0.637051  0.658854
char                     12.0  0.600079  0.070327  0.469178  0.579143  0.620645  0.648738  0.670190
cust_all-1gram           12.0  0.560850  0.062405  0.440141  0.547690  0.587992  0.597148  0.623542
cust_all-3gram           12.0  0.589742  0.072032  0.457539  0.576271  0.618028  0.633470  0.653504
cust_no_nums-1gram       12.0  0.555949  0.060072  0.445614  0.536471  0.584496  0.596083  0.604692
cust_no_nums-3gram       12.0  0.589490  0.070813  0.457539  0.577209  0.620488  0.632463  0.648438
cust_only_alpha-1gram    12.0  0.557290  0.060581  0.440141  0.535464  0.579711  0.595053  0.622642
cust_only_alpha-3gram    12.0  0.595678  0.071185  0.459413  0.578810  0.623086  0.639356  0.661433