Args:
dictionary (~fairseq.data.Dictionary): encoding dictionary
embed_dim (int, optional): embedding dimension
embed_dict (str, optional): filename from which to load pre-trained
embeddings
max_positions (int, optional): maximum supported input sequence length
convolutions (list, optional): the convolutional layer structure. Each
list item `i` corresponds to convolutional layer `i`. Layers are
given as ``(out_channels, kernel_width, [residual])``. Residual