d7bcbade517750cc4fdb7642415e4cf05ae584e8,tutorial/utils.py,,collect_pubtator_annotations,#,8

Before Change


    """
    Given a list of ddlite Documents with PubTator/CDR annotations,
    extract a dictionary of annotations by type.
    """
    annotations = defaultdict(list)
    for a in doc.attribs["root"].xpath(".//annotation"):

        # Relation annotations
        if len(a.xpath('./infon[@key="relation"]')) > 0:
            type = a.xpath('./infon[@key="relation"]/text()')[0]
            types = a.xpath('./infon[@key != "relation"]/@key')
            mesh_ids = a.xpath('./infon[@key != "relation"]/text()')
            annotations[type].append(PubtatorRelation(types=types, mesh_ids=mesh_ids))

        # Mention annotations
        else:
            txt = a.xpath("./text/text()")[0]
            offset = int(a.xpath("./location/@offset")[0])
            length = int(a.xpath("./location/@length")[0])
            type = a.xpath('./infon[@key="type"]/text()')[0]
            mesh = a.xpath('./infon[@key="MESH"]/text()')[0]
            annotations[type].append(PubtatorMention(mesh_id=mesh, text=txt,
                                                     char_offset=offset, char_length=length))
    return annotations
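
Illustrative sketch (not part of the commit): the snippet below mirrors the relation/mention branching of the "Before Change" code on a tiny, invented XML sample. It assumes a BioC-like layout inferred from the XPath expressions above, and it swaps the `.xpath` calls for the standard-library xml.etree.ElementTree API so that it runs without extra dependencies. The names Mention, Relation, SAMPLE and collect_annotations, as well as the MeSH-style IDs, are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for PubtatorMention / PubtatorRelation.
Mention = namedtuple("Mention", "mesh_id text char_offset char_length")
Relation = namedtuple("Relation", "types mesh_ids")

# Invented sample; layout assumed from the XPath expressions in the original code.
SAMPLE = """
<document>
  <passage>
    <annotation id="0">
      <infon key="type">Chemical</infon>
      <infon key="MESH">D000001</infon>
      <location offset="10" length="7"/>
      <text>aspirin</text>
    </annotation>
    <annotation id="R0">
      <infon key="relation">CID</infon>
      <infon key="Chemical">D000001</infon>
      <infon key="Disease">D000002</infon>
    </annotation>
  </passage>
</document>
"""


def collect_annotations(root):
    """Group annotations by type, following the branching shown above."""
    annotations = defaultdict(list)
    for a in root.iterfind(".//annotation"):
        # (key, text) pairs for every <infon> child, in document order.
        infons = [(i.get("key"), i.text) for i in a.findall("infon")]
        relation = [v for k, v in infons if k == "relation"]

        if relation:
            # Relation annotation: every other infon is an argument.
            args = [(k, v) for k, v in infons if k != "relation"]
            annotations[relation[0]].append(
                Relation(types=[k for k, _ in args],
                         mesh_ids=[v for _, v in args]))
        else:
            # Mention annotation: text span plus type and MeSH ID.
            lookup = dict(infons)
            loc = a.find("location")
            annotations[lookup["type"]].append(
                Mention(mesh_id=lookup["MESH"],
                        text=a.findtext("text"),
                        char_offset=int(loc.get("offset")),
                        char_length=int(loc.get("length"))))
    return annotations


result = collect_annotations(ET.fromstring(SAMPLE.strip()))
```

Here `result` maps each annotation type found in the sample ("Chemical" for the mention, "CID" for the relation) to a list of the corresponding tuples.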

After Change


    sent_offsets = [ensure_dict(s)[CHAR_OFFSETS][0] for s in sents]

    # Get Ngrams
    ngrams = []
    for a in doc.attribs["root"].xpath(".//annotation"):

        # Relation annotations
        if len(a.xpath('./infon[@key="relation"]')) > 0:

In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 5

Instances


Project Name: snorkel-team/snorkel

Commit Name: d7bcbade517750cc4fdb7642415e4cf05ae584e8

Time: 2016-07-03

Author: ajratner@gmail.com

File Name: tutorial/utils.py

Class Name:

Method Name: collect_pubtator_annotations


Project Name: jbms/beancount-import

Commit Name: bc19ed6b434f77e5758f4baf9cb23ed5d2b25a72

Time: 2020-08-07

Author: dumbpyx@gmail.com

File Name: beancount_import/source/generic_importer_source.py

Class Name: ImporterSource

Method Name: prepare


Project Name: jbms/beancount-import

Commit Name: 7450bce543a9ac268b6bc5e97b557c9596823aed

Time: 2020-08-07

Author: dumbpyx@gmail.com

File Name: beancount_import/source/generic_importer_source.py

Class Name: ImporterSource

Method Name: prepare