926c9ea2e480693701aaa0d61e85c67dbba672cb,lexos/processors/analyze/topword.py,,_z_test_word_list_,#,47

Before Change


             maps a word to its z-score.
    
    # initialize
    word_z_score_dict = {}
    row_sum = np.sum(count_list_i).item()
    total_sum = np.sum(count_list_j).item()

    # analyze

After Change


             maps a word to its z-score.
    
    # initialize
    word_z_score_list = []
    row_sum = np.sum(count_list_i).item()
    total_sum = np.sum(count_list_j).item()

    # analyze
    for index, word in enumerate(words):
        p_i = count_list_i[index] / row_sum
        p_j = count_list_j[index] / total_sum
        z_score = _z_test_(p1=p_i, pt=p_j, n1=row_sum, nt=total_sum)
        # drop statistically insignificant results, i.e. those whose
        # absolute z-score is below 1.96 (the two-sided 5% critical value)
        if abs(z_score) >= 1.96:
            word_z_score_list.append((word, z_score))

    # sort the list of (word, z-score) tuples by absolute z-score, from larger to smaller
    sorted_word_z_score_list = sorted(word_z_score_list,
                                      key=lambda tup: abs(tup[1]),
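
The excerpt above is cut off inside the `sorted(...)` call and does not show `_z_test_`. The following is a minimal, self-contained sketch of how the "After Change" logic could fit together. It rests on assumptions: the helper `z_test` is a hypothetical stand-in that uses a pooled two-proportion z-test, and `reverse=True` is inferred from the "larger to smaller" comment. Neither is taken from the commit itself.

```python
import math

import numpy as np


def z_test(p1, pt, n1, nt):
    """Hypothetical stand-in for _z_test_: pooled two-proportion z-test."""
    # Pooled proportion across both samples.
    p = (p1 * n1 + pt * nt) / (n1 + nt)
    standard_error = math.sqrt(p * (1 - p) * (1 / n1 + 1 / nt))
    # Avoid dividing by zero when the pooled proportion is 0 or 1.
    if math.isclose(standard_error, 0.0, abs_tol=1e-12):
        return 0.0
    return (p1 - pt) / standard_error


def z_test_word_list(count_list_i, count_list_j, words):
    """Return (word, z-score) tuples with |z| >= 1.96, largest |z| first."""
    word_z_score_list = []
    row_sum = np.sum(count_list_i).item()
    total_sum = np.sum(count_list_j).item()

    for index, word in enumerate(words):
        p_i = count_list_i[index] / row_sum
        p_j = count_list_j[index] / total_sum
        z_score = z_test(p1=p_i, pt=p_j, n1=row_sum, nt=total_sum)
        # Keep only results significant at the two-sided 5% level.
        if abs(z_score) >= 1.96:
            word_z_score_list.append((word, z_score))

    # Assumed ordering: descending by absolute z-score.
    return sorted(word_z_score_list,
                  key=lambda tup: abs(tup[1]),
                  reverse=True)
```

Collecting tuples in a list rather than a dict, as the change does, lets the result carry its sort order directly instead of requiring a separate sorted view of the dictionary's items.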
In pattern: SUPERPATTERN

Frequency: 4

Non-data size: 6

Instances


Project Name: WheatonCS/Lexos
Commit Name: 926c9ea2e480693701aaa0d61e85c67dbba672cb
Time: 2017-08-14
Author: weltch1997@gmail.com
File Name: lexos/processors/analyze/topword.py
Class Name:
Method Name: _z_test_word_list_


Project Name: UFAL-DSG/tgen
Commit Name: a6ca04e3c919ec21de59c915b40d027f7f9dad4c
Time: 2014-08-25
Author: odusek@ufal.mff.cuni.cz
File Name: tgen/features.py
Class Name: Features
Method Name: parse_feature_spec


Project Name: nilmtk/nilmtk
Commit Name: 63ef8bfa3b63090437bd27bf1e04db7ae1eae3cf
Time: 2014-11-27
Author: jack-list@xlk.org.uk
File Name: nilmtk/stats/totalenergy.py
Class Name:
Method Name: get_total_energy


Project Name: SheffieldML/GPy
Commit Name: 57c4306d9282b8d3fc815336c86c7af64f75a756
Time: 2015-10-03
Author: ibinbei@gmail.com
File Name: GPy/plotting/gpy_plot/gp_plots.py
Class Name:
Method Name: _plot_confidence