Model¶

main¶

architecture¶

class nlper.model.architecture.BahdanauAttention(hidden_size: int)¶

Bahdanau attention. Initializes a fully connected layer and the internal parameter V with uniformly distributed weights.

Parameters: hidden_size (int) – Number of features in the fully connected layer of attention

forward(hidden: torch.Tensor, encoder_outputs: torch.Tensor) → torch.Tensor¶

Calculates attention weights by applying softmax on attention alignment scores.

Parameters

hidden (torch.Tensor) – Encoder hidden states
encoder_outputs (torch.Tensor) – Encoder outputs

Returns

Attention weights

Return type

torch.Tensor

score(hidden: torch.Tensor, encoder_outputs: torch.Tensor) → torch.Tensor¶

Calculates alignment scores of attention.

Parameters

hidden (torch.Tensor) – Encoder hidden states
encoder_outputs (torch.Tensor) – Encoder outputs

Returns

Attention alignment scores

Return type

torch.Tensor
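The relationship between score and forward can be illustrated with a minimal sketch of additive (Bahdanau-style) attention. The class name AdditiveAttentionSketch, the batch-first tensor layout and the concatenation-based energy are assumptions made for illustration; the actual nlper implementation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdditiveAttentionSketch(nn.Module):
    """Illustrative additive attention; not the nlper implementation."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Fully connected layer applied to the concatenation [hidden; encoder_output].
        self.fc = nn.Linear(hidden_size * 2, hidden_size)
        # Internal parameter V, initialised from a uniform distribution on [0, 1).
        self.v = nn.Parameter(torch.rand(hidden_size))

    def score(self, hidden: torch.Tensor, encoder_outputs: torch.Tensor) -> torch.Tensor:
        # Assumed shapes: hidden (batch, hidden), encoder_outputs (batch, seq_len, hidden).
        seq_len = encoder_outputs.size(1)
        repeated = hidden.unsqueeze(1).expand(-1, seq_len, -1)
        energy = torch.tanh(self.fc(torch.cat((repeated, encoder_outputs), dim=2)))
        # Project every energy vector onto V, giving scores of shape (batch, seq_len).
        return energy @ self.v

    def forward(self, hidden: torch.Tensor, encoder_outputs: torch.Tensor) -> torch.Tensor:
        # Softmax over the sequence dimension turns scores into attention weights.
        return F.softmax(self.score(hidden, encoder_outputs), dim=1)


attention = AdditiveAttentionSketch(hidden_size=8)
weights = attention(torch.zeros(2, 8), torch.randn(2, 5, 8))
```

In this sketch the weights for each batch element sum to 1 across the sequence positions, which is what allows them to be used as coefficients of a weighted average of encoder outputs.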

class nlper.model.architecture.DecoderRNN(embedding_size: int, hidden_size: int, output_size: int, n_layers: int = 1, dropout: float = 0.1)¶

Model decoder class. Initializes an embedding layer, a dropout layer, the Bahdanau attention module, a unidirectional GRU and a linear classifier.

Parameters

embedding_size (int) – Size of embedding layer, number of expected features in GRU
hidden_size (int) – Number of features in the hidden state of GRU and in fully connected layer of attention
output_size (int) – Number of unique words in vocabulary
n_layers (int) – Number of recurrent layers in GRU
dropout (float) – Dropout probability applied to each GRU layer except the last

forward(sequence: torch.Tensor, hidden: torch.Tensor, encoder_outputs: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]¶

Defines decoder structure and flow.

Pushes sequence through embedding layer
Applies dropout
Calls attention layer to obtain attention weights
Calculates context vector of attention
Concatenates context vector with previous decoder output
Feeds GRU with concatenation result
Generates final output by applying softmax

Parameters

sequence (torch.Tensor) – StartOfSentence token or previous decoder output
hidden (torch.Tensor) – Hidden state
encoder_outputs (torch.Tensor) – Encoder output

Returns

Decoder output, decoder hidden state and attention weights

Return type

tuple
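The context vector and concatenation steps of the decoder flow can be sketched as follows. The batch-first shapes and the variable names are assumptions for illustration, and the random tensors stand in for real attention weights, encoder outputs and the embedded previous decoder output.

```python
import torch

batch, seq_len, hidden_size, embedding_size = 2, 5, 8, 4

# Stand-ins for values produced earlier in the decoder step (assumed shapes).
attention_weights = torch.softmax(torch.randn(batch, seq_len), dim=1)
encoder_outputs = torch.randn(batch, seq_len, hidden_size)
embedded = torch.randn(batch, 1, embedding_size)  # embedded previous decoder output

# Context vector: attention-weighted sum of encoder outputs, shape (batch, 1, hidden).
context = torch.bmm(attention_weights.unsqueeze(1), encoder_outputs)

# Concatenate along the feature dimension to form the input fed to the GRU.
gru_input = torch.cat((embedded, context), dim=2)
```

Under these assumptions the GRU would receive embedding_size + hidden_size features per step.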

class nlper.model.architecture.EncoderRNN(input_size: int, embedding_size: int, hidden_size: int, n_layers: int = 1, dropout: float = 0.1)¶

Model encoder class. Initializes an embedding layer and a bidirectional GRU.

Parameters

input_size (int) – Number of unique words in vocabulary
embedding_size (int) – Size of embedding layer, number of expected features in GRU
hidden_size (int) – Number of features in the hidden state of GRU
n_layers (int) – Number of recurrent layers in GRU
dropout (float) – Dropout probability applied to each GRU layer except the last

forward(sequence: torch.Tensor, hidden: Any = None) → Tuple[torch.Tensor, torch.Tensor]¶

Defines encoder structure and flow.

Pushes sequence through embedding layer
Feeds GRU with embedded sequence
Merges the outputs of the bidirectional GRU into a single tensor

Parameters

sequence (torch.Tensor) – Tensor of indices representing text
hidden (torch.Tensor, optional) – Initial hidden state of GRU, default None

Returns

Encoder output and encoder hidden states

Return type

tuple
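One common way to merge the two directions of a bidirectional GRU is to sum the forward and backward halves of its output. The sketch below assumes that strategy and a batch-first layout; the documentation does not state which merge the encoder actually uses.

```python
import torch
import torch.nn as nn

embedding_size, hidden_size = 4, 8
gru = nn.GRU(embedding_size, hidden_size, bidirectional=True, batch_first=True)

embedded = torch.randn(2, 5, embedding_size)  # (batch, seq_len, embedding)
outputs, hidden = gru(embedded)               # outputs: (batch, seq_len, 2 * hidden)

# Assumed merge strategy: add the forward half to the backward half.
merged = outputs[:, :, :hidden_size] + outputs[:, :, hidden_size:]
```

Summation keeps the feature size equal to hidden_size, so the merged tensor matches the size a unidirectional decoder with the same hidden_size would expect.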

class nlper.model.architecture.Seq2Seq(encoder: torch.nn.modules.module.Module, decoder: torch.nn.modules.module.Module)¶

Sequence to Sequence model, built using encoder and decoder.

Parameters

encoder (nn.Module) – Encoder model
decoder (nn.Module) – Decoder model with Bahdanau attention

forward(text: torch.Tensor, summary: torch.Tensor, teacher_forcing_ratio: float = 0.5) → torch.Tensor¶

Defines Seq2Seq structure and flow. The teacher forcing ratio is the probability that, at each step, the decoder output is replaced by the target summary token as the input for generating the next word. Teacher forcing is used to speed up model training.

Feeds encoder with input indices
Initializes decoder hidden state as encoder hidden state
Initializes decoder output with Start of Sequence <sos> token
Initializes summary output vector
Until the maximum summary length is reached:
- Feeds decoder with decoder output, hidden state and encoder output
- Updates decoder output and hidden state
- Updates summary output vector with decoder output token
- With probability teacher_forcing_ratio, replaces decoder output with the target summary token

Parameters

text (torch.Tensor) – Indices of input text
summary (torch.Tensor) – Indices of target / reference summary
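teacher_forcing_ratio (float) – Probability of replacing the decoder output with the target summary token for the next word generation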

Returns

Output sequence / summary

Return type

torch.Tensor
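The decoding loop with teacher forcing can be sketched as below. The function decode_with_teacher_forcing, the step_fn callable standing in for the decoder, and the sequence-first layout of summary are hypothetical; hidden state and encoder outputs are omitted to keep the example short.

```python
import random

import torch


def decode_with_teacher_forcing(step_fn, summary, sos_index, teacher_forcing_ratio=0.5):
    """Illustrative loop; step_fn is a stand-in for one decoder call."""
    max_len, batch = summary.shape
    # Start every sequence in the batch with the <sos> token.
    token = torch.full((batch,), sos_index, dtype=torch.long)
    outputs = []
    for t in range(max_len):
        logits = step_fn(token)               # (batch, vocab)
        outputs.append(logits)
        predicted = logits.argmax(dim=1)
        # With probability teacher_forcing_ratio, feed the target token next.
        use_target = random.random() < teacher_forcing_ratio
        token = summary[t] if use_target else predicted
    return torch.stack(outputs)               # (max_len, batch, vocab)


vocab_size = 6


def dummy_step(token):
    # Hypothetical decoder step returning random logits.
    return torch.randn(token.size(0), vocab_size)


summary = torch.randint(0, vocab_size, (4, 2))
result = decode_with_teacher_forcing(dummy_step, summary, sos_index=1)
```

A ratio of 1.0 always feeds the reference token, while 0.0 always feeds the model's own prediction.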

model¶

class nlper.model.model.Model(config: Dict[str, Any], vocab_config: Any)¶

Utilities for the Seq2Seq model.

Executes:
- model training
- text prediction (summarization)
- model evaluation
- saving and loading of a model

Parameters

config (dict) – Config dictionary
vocab_config (VocabConfig) – Vocabulary config for model

create_model() → None¶: Initializes full Seq2Seq model with encoder and decoder as specified in config file.

create_optimizers_and_loss() → None¶: Initializes the Adam optimizer for the Seq2Seq model and the learning rate scheduler as specified in the config file. Initializes CrossEntropyLoss so that the padding token <pad> in a sequence is ignored.
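A minimal sketch of this setup is shown below. The nn.Linear stand-in model, the pad_index value, the learning rate and the choice of StepLR as scheduler are all assumptions; in nlper these come from the config file and vocabulary.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 6)   # stand-in for the Seq2Seq model
pad_index = 0             # hypothetical vocabulary index of <pad>

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# StepLR is only one possible scheduler; the configured one may differ.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
criterion = nn.CrossEntropyLoss(ignore_index=pad_index)

logits = torch.randn(4, 6)
targets = torch.tensor([0, 2, 3, 0])   # two of the four positions are padding
loss = criterion(logits, targets)

# Reference value: mean cross entropy over the non-padding positions only.
mask = targets != pad_index
reference = nn.functional.cross_entropy(logits[mask], targets[mask])
```

With ignore_index set, padded positions contribute neither to the loss value nor to its gradient, so loss and reference agree.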

evaluate(valid_iterator: Any) → List[torch.Tensor]¶

Evaluates the trained Seq2Seq model performance.

Parameters: valid_iterator (torchtext.data.BucketIterator) – Valid or test iterator
Returns

get_text_summary_from_batch(batch) → Tuple[torch.Tensor, torch.Tensor]¶

Obtains original text and target summary indices from batch and transfers them to GPU.

Parameters: batch (torchtext.data.batch.Batch) – Batch of examples

Returns

Text and summary indices for model

Return type

tuple

load_model(model_path: str, attention_param_path: str = None) → None¶

Loads a trained model and transfers it to GPU. Currently the attention V parameter is also saved and loaded separately, because PyTorch does not support saving nn.Parameter directly.

Parameters

model_path (str) – Path to trained model
attention_param_path (str) – Path to trained attention parameter

predict(text: str, length_of_original_text: float = 0.25) → Tuple[str, torch.Tensor]¶

Predicts model output / summarizes given text. Obtains summarization with defined maximum percentage of length of original text. Returns summarization and attention weights to plot attention heatmap.

Parameters

text (str) – Original text to summarize
length_of_original_text (float) – Maximum ratio of summary length to original text length

Returns

Summary text and attention weights

Return type

tuple
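How a ratio such as length_of_original_text might translate into a maximum summary length can be sketched with a small helper. The function max_summary_length and its whitespace tokenisation are hypothetical; the tokenisation and rounding used by predict are not documented here.

```python
import math


def max_summary_length(text: str, length_of_original_text: float = 0.25) -> int:
    """Hypothetical helper: cap the summary at a fraction of the input's word count."""
    n_tokens = len(text.split())  # assumed whitespace tokenisation
    # Always allow at least one token so short inputs still yield a summary.
    return max(1, math.floor(n_tokens * length_of_original_text))


limit = max_summary_length("one two three four five six seven eight")
```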

save_model(model_path: str, model_epoch: int) → None¶

Saves trained model weights after an epoch, transferred to CPU. Currently the attention V parameter is also saved separately, because PyTorch does not support saving nn.Parameter directly.

Parameters

model_path (str) – Path to save model
model_epoch (int) – Model epoch
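The save and load round trip described for save_model and load_model can be sketched as follows. The stand-in nn.Linear model, the attention_v tensor and the file names are hypothetical; the sketch only illustrates saving weights and a separate parameter on CPU and restoring them on the available device.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                      # stand-in for the trained Seq2Seq model
attention_v = nn.Parameter(torch.rand(2))    # stand-in for the attention parameter V

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model_epoch_1.pt")       # hypothetical file names
    param_path = os.path.join(tmp, "attention_epoch_1.pt")

    # Save: weights as a state dict, the parameter as a plain CPU tensor.
    torch.save(model.state_dict(), model_path)
    torch.save(attention_v.detach().cpu(), param_path)

    # Load: map to CPU first, then move to GPU if one is available.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(model_path, map_location="cpu"))
    restored.to(device)
    restored_v = nn.Parameter(torch.load(param_path, map_location="cpu").to(device))
```

Loading with map_location="cpu" keeps the checkpoint usable on machines without a GPU.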

show_loss(batch_id: int, loss: torch.Tensor, train_iterator: Any) → None¶

Logs loss value for specified batch.

Parameters

batch_id (int) – Number of batch
loss (torch.Tensor) – Loss value for batch
train_iterator (torchtext.data.BucketIterator) – Train iterator

show_rouge_and_attention_matrix(epoch: int, batch_id: int, text: torch.Tensor, summary: torch.Tensor) → None¶

Calls ROUGE metric calculation and attention heatmap drawing.

Parameters

epoch (int) – Current training epoch.
batch_id (int) – Number of batch
text (torch.Tensor) – Model input / original text indices tensor
summary (torch.Tensor) – Model generated summary text indices tensor

train(train_iterator: Any, epoch: int = 0) → List[torch.Tensor]¶

Executes model training.

Parameters

train_iterator (torchtext.data.BucketIterator) – Iterator over training dataset
epoch (int) – Current training epoch.

Returns

Training loss values for batches

Return type

list
