function uses the grad, alongside the embedding_matrix, to select the token that maximizes the
first-order Taylor approximation of the loss (written out as an equation after this block).
grad = torch.from_numpy(grad)
if token_idx >= self.embedding_matrix.size(0):
    # This happens when we've truncated our fake embedding matrix. We need to do a dot
    # product with the word vector of the current token; if that token is out of
    # vocabulary for our truncated matrix, we need to run it through the embedding layer.
    inputs = self._make_embedder_input([self.vocab.get_token_from_index(token_idx)])
    word_embedding = self.embedding_layer(inputs)[0]
else:
    word_embedding = torch.nn.functional.embedding(
        torch.LongTensor([token_idx]), self.embedding_matrix
    )
word_embedding = word_embedding.detach().unsqueeze(0)
grad = grad.unsqueeze(0).unsqueeze(0)
# solves equation (3) here https://arxiv.org/abs/1903.06620
new_embed_dot_grad = torch.einsum("bij,kj->bik", (grad, self.embedding_matrix))
prev_embed_dot_grad = torch.einsum("bij,bij->bi", (grad, word_embedding)).unsqueeze(-1)
neg_dir_dot_grad = sign * (prev_embed_dot_grad - new_embed_dot_grad)
neg_dir_dot_grad = neg_dir_dot_grad.detach().cpu().numpy()
# Do not replace with non-alphanumeric tokens
neg_dir_dot_grad[:, :, self.invalid_replacement_indices] = -numpy.inf
best_at_each_step = neg_dir_dot_grad.argmax(2)
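For reference, this is the objective that the two einsum lines and the final argmax compute, i.e. the first-order Taylor selection the docstring mentions (equation (3) of https://arxiv.org/abs/1903.06620 per the comment). The notation below is read off the code rather than copied from the paper: e_w is the current token's embedding (word_embedding), e_{w'} ranges over the rows of embedding_matrix, prev_embed_dot_grad is e_w^T grad, new_embed_dot_grad stacks e_{w'}^T grad for every candidate, invalid replacement indices are masked to -inf, and sign = -1 corresponds to increasing the loss:

w^{*} \;=\; \operatorname*{arg\,max}_{w' \in \mathcal{V}} \;\; \mathrm{sign} \cdot \big( e_{w} - e_{w'} \big)^{\top} \nabla_{e_{w}} \mathcal{L}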
After Change
function uses the grad, alongside the embedding_matrix, to select the token that maximizes the
first-order Taylor approximation of the loss.
grad = util.move_to_device(torch.from_numpy(grad), self.cuda_device)
if token_idx >= self.embedding_matrix.size(0):
    # This happens when we've truncated our fake embedding matrix. We need to do a dot
    # product with the word vector of the current token; if that token is out of
    # vocabulary for our truncated matrix, we need to run it through the embedding layer.
    inputs = self._make_embedder_input([self.vocab.get_token_from_index(token_idx)])
    word_embedding = self.embedding_layer(inputs)[0]
else:
    word_embedding = torch.nn.functional.embedding(
        util.move_to_device(torch.LongTensor([token_idx]), self.cuda_device),
        self.embedding_matrix,
    )
word_embedding = word_embedding.detach().unsqueeze(0)
grad = grad.unsqueeze(0).unsqueeze(0)
# solves equation (3) here https://arxiv.org/abs/1903.06620
new_embed_dot_grad = torch.einsum("bij,kj->bik", (grad, self.embedding_matrix))
prev_embed_dot_grad = torch.einsum("bij,bij->bi", (grad, word_embedding)).unsqueeze(-1)
neg_dir_dot_grad = sign * (prev_embed_dot_grad - new_embed_dot_grad)
neg_dir_dot_grad = neg_dir_dot_grad.detach().cpu().numpy()
# Do not replace with non-alphanumeric tokens
neg_dir_dot_grad[:, :, self.invalid_replacement_indices] = -numpy.inf
best_at_each_step = neg_dir_dot_grad.argmax(2)
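The substance of the change is device handling: the gradient and the token-index tensor are now moved to self.cuda_device (via AllenNLP's util.move_to_device) before the embedding lookup and the einsums, so they sit on the same device as self.embedding_matrix when the model runs on a GPU. Below is a minimal sketch of the same selection step in plain PyTorch; every name and size in it is illustrative, not part of the AllenNLP API.

# Minimal sketch (assumed, plain PyTorch): choose the replacement token whose embedding
# maximizes the first-order Taylor approximation of the loss change.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

vocab_size, embed_dim = 1000, 64
embedding_matrix = torch.randn(vocab_size, embed_dim, device=device)

# Gradient of the loss w.r.t. the current token's embedding, and the current token id.
# Both are moved to the same device as the embedding matrix, which is the point of the
# change above.
grad = torch.randn(embed_dim).to(device)
token_idx = torch.LongTensor([42]).to(device)

word_embedding = torch.nn.functional.embedding(token_idx, embedding_matrix)  # (1, embed_dim)

sign = -1  # -1 picks the direction that increases the loss under the linear approximation
scores = sign * (word_embedding @ grad - embedding_matrix @ grad)  # (vocab_size,)
best_replacement = int(scores.argmax())

Up to the extra batch dimensions, scores plays the role of neg_dir_dot_grad above; the only step omitted is masking self.invalid_replacement_indices with -inf before the argmax.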