XLM
The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. It is a transformer pretrained with one of the following objectives: causal language modeling (CLM), masked language modeling (MLM), or translation language modeling (TLM).
The abstract from the paper is the following:
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT’16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT’16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.
class transformers.XLMConfig
( vocab_size = 30145, emb_dim = 2048, n_layers = 12, n_heads = 16, dropout = 0.1, attention_dropout = 0.1, gelu_activation = True, sinusoidal_embeddings = False, causal = False, asm = False, n_langs = 1, use_lang_emb = True, max_position_embeddings = 512, embed_init_std = 0.02209708691207961, layer_norm_eps = 1e-12, init_std = 0.02, bos_index = 0, eos_index = 1, pad_index = 2, unk_index = 3, mask_index = 5, is_encoder = True, summary_type = 'first', summary_use_proj = True, summary_activation = None, summary_proj_to_labels = True, summary_first_dropout = 0.1, start_n_top = 5, end_n_top = 5, mask_token_id = 0, lang_id = 0, pad_token_id = 2, bos_token_id = 0, **kwargs )
This is the configuration class to store the configuration of an XLMModel or a TFXLMModel. It is used to instantiate an XLM model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the FacebookAI/xlm-mlm-en-2048 architecture.
Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.
Examples:
>>> from transformers import XLMConfig, XLMModel

>>> # Initializing an XLM configuration
>>> configuration = XLMConfig()

>>> # Initializing a model (with random weights) from the configuration
>>> model = XLMModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
class transformers.XLMTokenizer
( vocab_file, merges_file, unk_token = "<unk>", sep_token = "</s>", pad_token = "<pad>", … )
Construct an XLM tokenizer. Based on Byte-Pair Encoding. The tokenization process is the following:

- Moses preprocessing and tokenization for most supported languages.
- Language-specific tokenization for Chinese (Jieba), Japanese (KyTea) and Thai (PyThaiNLP).
- Optionally lowercases and normalizes all input text.
- The argument special_tokens and the function set_special_tokens can be used to add additional symbols (like "__classify__") to a vocabulary.
- The lang2id attribute maps the languages supported by the model with their IDs if provided (automatically set for pretrained vocabularies).
- The id2lang attribute does the reverse mapping if provided (automatically set for pretrained vocabularies).
This tokenizer inherits from PreTrainedTokenizer which contains most of the main methods. Users should refer to this superclass for more information regarding those methods.
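A minimal usage sketch, assuming the FacebookAI/xlm-mlm-en-2048 checkpoint used elsewhere on this page (the exact contents of lang2id depend on the checkpoint):

>>> from transformers import XLMTokenizer

>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> # lang2id / id2lang are set automatically for pretrained vocabularies
>>> print(tokenizer.lang2id)  # e.g. {"en": 0} for this English-only checkpoint
>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")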
build_inputs_with_special_tokens
( token_ids_0: List, token_ids_1: Optional = None ) → List[int]
Parameters

- token_ids_0 (List[int]) — List of IDs to which the special tokens will be added.
- token_ids_1 (List[int], optional) — Optional second list of IDs for sequence pairs.

Returns: List of input IDs with the appropriate special tokens.
Build model inputs from a sequence or a pair of sequences for sequence classification tasks by concatenating and adding special tokens. An XLM sequence has the following format:

- single sequence: <s> X </s>
- pair of sequences: <s> A </s> B </s>
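For illustration, a short sketch of these formats (assuming the FacebookAI/xlm-mlm-en-2048 checkpoint; the actual IDs depend on the vocabulary, so the tokens are shown via convert_ids_to_tokens):

>>> from transformers import XLMTokenizer

>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> ids_a = tokenizer.convert_tokens_to_ids(tokenizer.tokenize("hello"))
>>> ids_b = tokenizer.convert_tokens_to_ids(tokenizer.tokenize("world"))
>>> # single sequence: <s> X </s>
>>> tokenizer.convert_ids_to_tokens(tokenizer.build_inputs_with_special_tokens(ids_a))
>>> # pair of sequences: <s> A </s> B </s>
>>> tokenizer.convert_ids_to_tokens(tokenizer.build_inputs_with_special_tokens(ids_a, ids_b))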
get_special_tokens_mask
( token_ids_0: List, token_ids_1: Optional = None, already_has_special_tokens: bool = False ) → List[int]
Parameters

- token_ids_0 (List[int]) — List of IDs.
- token_ids_1 (List[int], optional) — Optional second list of IDs for sequence pairs.
- already_has_special_tokens (bool, optional, defaults to False) — Whether or not the token list is already formatted with special tokens for the model.

Returns: A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
Retrieve sequence ids from a token list that has no special tokens added. This method is called when adding special tokens using the tokenizer prepare_for_model method.
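A brief sketch, assuming the same checkpoint as above; the mask marks the positions where <s> and </s> would be added:

>>> from transformers import XLMTokenizer

>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize("hello world"))
>>> # e.g. [1, 0, ..., 0, 1]: 1 for the added special tokens, 0 for sequence tokens
>>> tokenizer.get_special_tokens_mask(ids)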
create_token_type_ids_from_sequences
( token_ids_0: List, token_ids_1: Optional = None ) → List[int]
Parameters

- token_ids_0 (List[int]) — List of IDs.
- token_ids_1 (List[int], optional) — Optional second list of IDs for sequence pairs.

Returns: List of token type IDs according to the given sequence(s).
Create a mask from the two sequences passed to be used in a sequence-pair classification task. An XLM sequence pair mask has the following format:

0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
| first sequence    | second sequence |
If token_ids_1 is None, this method only returns the first portion of the mask (0s).
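As a sketch of both cases (same assumed checkpoint as above):

>>> from transformers import XLMTokenizer

>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> ids_a = tokenizer.convert_tokens_to_ids(tokenizer.tokenize("hello"))
>>> ids_b = tokenizer.convert_tokens_to_ids(tokenizer.tokenize("world"))
>>> tokenizer.create_token_type_ids_from_sequences(ids_a)         # all 0s for a single sequence
>>> tokenizer.create_token_type_ids_from_sequences(ids_a, ids_b)  # 0s for the first segment, 1s for the second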
save_vocabulary
( save_directory: str, filename_prefix: Optional = None )
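A small sketch of saving the vocabulary files to an existing directory (save_pretrained is the higher-level entry point; the target path here is just an example):

>>> import os
>>> from transformers import XLMTokenizer

>>> tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> os.makedirs("./xlm-tokenizer", exist_ok=True)
>>> tokenizer.save_vocabulary("./xlm-tokenizer")  # returns the paths of the written vocabulary and merges files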
class transformers.models.xlm.modeling_xlm.XLMForQuestionAnsweringOutput
( loss: Optional = None, start_top_log_probs: Optional = None, start_top_index: Optional = None, end_top_log_probs: Optional = None, end_top_index: Optional = None, cls_logits: Optional = None, hidden_states: Optional = None, attentions: Optional = None )
Base class for outputs of question answering models using a SquadHead.
PyTorch
XLMModel
class transformers.XLMModel
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
The bare XLM Model transformer outputting raw hidden-states without any specific head on top.
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.BaseModelOutput or tuple(torch.FloatTensor)

The XLMModel forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMModel
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMModel.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)

>>> last_hidden_states = outputs.last_hidden_state
XLMWithLMHeadModel
class transformers.XLMWithLMHeadModel
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
The XLM Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, labels: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.MaskedLMOutput or tuple(torch.FloatTensor)

The XLMWithLMHeadModel forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMWithLMHeadModel
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMWithLMHeadModel.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("The capital of France is <special1>.", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> # retrieve the index of the mask token (<special1>)
>>> mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
>>> predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)

>>> labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
>>> # mask the labels of all tokens except the mask token
>>> labels = torch.where(inputs.input_ids == tokenizer.mask_token_id, labels, -100)
>>> outputs = model(**inputs, labels=labels)
XLMForSequenceClassification
class transformers.XLMForSequenceClassification
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for GLUE tasks.
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, labels: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.SequenceClassifierOutput or tuple(torch.FloatTensor)

The XLMForSequenceClassification forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example of single-label classification:
>>> import torch
>>> from transformers import AutoTokenizer, XLMForSequenceClassification

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForSequenceClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> predicted_class_id = logits.argmax().item()

>>> # to train a model on `num_labels` classes, pass `num_labels` to `from_pretrained`
>>> num_labels = len(model.config.id2label)
>>> model = XLMForSequenceClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048", num_labels=num_labels)
>>> labels = torch.tensor([1])
>>> loss = model(**inputs, labels=labels).loss
Example of multi-label classification:
>>> import torch
>>> from transformers import AutoTokenizer, XLMForSequenceClassification

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForSequenceClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048", problem_type="multi_label_classification")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> predicted_class_ids = torch.arange(0, logits.shape[-1])[torch.sigmoid(logits).squeeze(dim=0) > 0.5]

>>> # to train a model on `num_labels` classes, pass `num_labels` to `from_pretrained`
>>> num_labels = len(model.config.id2label)
>>> model = XLMForSequenceClassification.from_pretrained(
...     "FacebookAI/xlm-mlm-en-2048", num_labels=num_labels, problem_type="multi_label_classification"
... )

>>> labels = torch.sum(
...     torch.nn.functional.one_hot(predicted_class_ids[None, :].clone(), num_classes=num_labels), dim=1
... ).to(torch.float)
>>> loss = model(**inputs, labels=labels).loss
XLMForMultipleChoice
class transformers.XLMForMultipleChoice
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a multiple choice classification head on top (a linear layer on top of the pooled output and a softmax) e.g. for RocStories/SWAG tasks.
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, labels: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.MultipleChoiceModelOutput or tuple(torch.FloatTensor)

The XLMForMultipleChoice forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMForMultipleChoice
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForMultipleChoice.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
>>> choice0 = "It is eaten with a fork and a knife."
>>> choice1 = "It is eaten while held in the hand."
>>> labels = torch.tensor(0).unsqueeze(0)  # choice0 is the correct answer, batch size 1

>>> encoding = tokenizer([prompt, prompt], [choice0, choice1], return_tensors="pt", padding=True)
>>> outputs = model(**{k: v.unsqueeze(0) for k, v in encoding.items()}, labels=labels)

>>> # the linear classifier still needs to be trained
>>> loss = outputs.loss
>>> logits = outputs.logits
XLMForTokenClassification
class transformers.XLMForTokenClassification
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, labels: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.TokenClassifierOutput or tuple(torch.FloatTensor)

The XLMForTokenClassification forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMForTokenClassification
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForTokenClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer(
...     "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="pt"
... )

>>> with torch.no_grad():
...     logits = model(**inputs).logits

>>> predicted_token_class_ids = logits.argmax(-1)

>>> # Note that tokens are classified rather than input words, so there may be more
>>> # predicted token classes than words, and several tokens may map to the same word.
>>> predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]

>>> labels = predicted_token_class_ids
>>> loss = model(**inputs, labels=labels).loss
XLMForQuestionAnsweringSimple
class transformers.XLMForQuestionAnsweringSimple
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, start_positions: Optional = None, end_positions: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.modeling_outputs.QuestionAnsweringModelOutput or tuple(torch.FloatTensor)

The XLMForQuestionAnsweringSimple forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMForQuestionAnsweringSimple
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForQuestionAnsweringSimple.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

>>> inputs = tokenizer(question, text, return_tensors="pt")
>>> with torch.no_grad():
...     outputs = model(**inputs)

>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()

>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]

>>> # target span is "nice puppet"
>>> target_start_index = torch.tensor([14])
>>> target_end_index = torch.tensor([15])

>>> outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
>>> loss = outputs.loss
XLMForQuestionAnswering
class transformers.XLMForQuestionAnswering
( config )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a beam-search span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).
This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
forward
( input_ids: Optional = None, attention_mask: Optional = None, langs: Optional = None, token_type_ids: Optional = None, position_ids: Optional = None, lengths: Optional = None, cache: Optional = None, head_mask: Optional = None, inputs_embeds: Optional = None, start_positions: Optional = None, end_positions: Optional = None, is_impossible: Optional = None, cls_index: Optional = None, p_mask: Optional = None, output_attentions: Optional = None, output_hidden_states: Optional = None, return_dict: Optional = None ) → transformers.models.xlm.modeling_xlm.XLMForQuestionAnsweringOutput or tuple(torch.FloatTensor)

The XLMForQuestionAnswering forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, XLMForQuestionAnswering
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = XLMForQuestionAnswering.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(
...     0
... )  # batch size 1
>>> start_positions = torch.tensor([1])
>>> end_positions = torch.tensor([3])

>>> outputs = model(input_ids, start_positions=start_positions, end_positions=end_positions)
>>> loss = outputs.loss
TensorFlow
TFXLMModel
class transformers.TFXLMModel
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
The bare XLM Model transformer outputting raw hidden-states without any specific head on top.
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
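As an illustration of the three options above, a hedged sketch using TFXLMModel and the checkpoint from the examples below (any of the model's documented inputs could be passed the same way):

>>> import tensorflow as tf
>>> from transformers import AutoTokenizer, TFXLMModel

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMModel.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> encoding = tokenizer("Hello, my dog is cute", return_tensors="tf")

>>> # 1. keyword arguments, as with PyTorch models
>>> out = model(input_ids=encoding["input_ids"], attention_mask=encoding["attention_mask"])
>>> # 2. a list of Tensors in the order given in the docstring
>>> out = model([encoding["input_ids"], encoding["attention_mask"]])
>>> # 3. a dictionary keyed by input names
>>> out = model({"input_ids": encoding["input_ids"], "attention_mask": encoding["attention_mask"]})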
call
( input_ids: TFModelInputType | None = None, attention_mask: tf.Tensor | None = None, langs: tf.Tensor | None = None, token_type_ids: tf.Tensor | None = None, position_ids: tf.Tensor | None = None, lengths: tf.Tensor | None = None, cache: Dict[str, tf.Tensor] | None = None, head_mask: tf.Tensor | None = None, inputs_embeds: tf.Tensor | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None, return_dict: bool | None = None, training: bool = False ) → transformers.modeling_tf_outputs.TFBaseModelOutput or tuple(tf.Tensor)

The TFXLMModel forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMModel
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMModel.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
>>> outputs = model(inputs)

>>> last_hidden_states = outputs.last_hidden_state
TFXLMWithLMHeadModel
class transformers.TFXLMWithLMHeadModel
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
The XLM Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
call
( input_ids: TFModelInputType | None = None, attention_mask: np.ndarray | tf.Tensor | None = None, langs: np.ndarray | tf.Tensor | None = None, token_type_ids: np.ndarray | tf.Tensor | None = None, position_ids: np.ndarray | tf.Tensor | None = None, lengths: np.ndarray | tf.Tensor | None = None, cache: Optional[Dict[str, tf.Tensor]] = None, head_mask: np.ndarray | tf.Tensor | None = None, inputs_embeds: np.ndarray | tf.Tensor | None = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, training: bool = False ) → transformers.models.xlm.modeling_tf_xlm.TFXLMWithLMHeadModelOutput or tuple(tf.Tensor)

The TFXLMWithLMHeadModel forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMWithLMHeadModel
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMWithLMHeadModel.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
>>> outputs = model(inputs)

>>> logits = outputs.logits
TFXLMForSequenceClassification
class transformers.TFXLMForSequenceClassification
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for GLUE tasks.
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
call
( input_ids: TFModelInputType | None = None, attention_mask: np.ndarray | tf.Tensor | None = None, langs: np.ndarray | tf.Tensor | None = None, token_type_ids: np.ndarray | tf.Tensor | None = None, position_ids: np.ndarray | tf.Tensor | None = None, lengths: np.ndarray | tf.Tensor | None = None, cache: Optional[Dict[str, tf.Tensor]] = None, head_mask: np.ndarray | tf.Tensor | None = None, inputs_embeds: np.ndarray | tf.Tensor | None = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, labels: np.ndarray | tf.Tensor | None = None, training: bool = False ) → transformers.modeling_tf_outputs.TFSequenceClassifierOutput or tuple(tf.Tensor)

The TFXLMForSequenceClassification forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMForSequenceClassification
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMForSequenceClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
>>> logits = model(**inputs).logits

>>> predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])

>>> # to train a model on `num_labels` classes, pass `num_labels` to `from_pretrained`
>>> num_labels = len(model.config.id2label)
>>> model = TFXLMForSequenceClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048", num_labels=num_labels)
>>> labels = tf.constant(1)
>>> loss = model(**inputs, labels=labels).loss
TFXLMForMultipleChoice
class transformers.TFXLMForMultipleChoice
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a multiple choice classification head on top (a linear layer on top of the pooled output and a softmax) e.g. for RocStories/SWAG tasks.
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
call
( input_ids: TFModelInputType | None = None, attention_mask: np.ndarray | tf.Tensor | None = None, langs: np.ndarray | tf.Tensor | None = None, token_type_ids: np.ndarray | tf.Tensor | None = None, position_ids: np.ndarray | tf.Tensor | None = None, lengths: np.ndarray | tf.Tensor | None = None, cache: Optional[Dict[str, tf.Tensor]] = None, head_mask: np.ndarray | tf.Tensor | None = None, inputs_embeds: np.ndarray | tf.Tensor | None = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, labels: np.ndarray | tf.Tensor | None = None, training: bool = False ) → transformers.modeling_tf_outputs.TFMultipleChoiceModelOutput or tuple(tf.Tensor)

The TFXLMForMultipleChoice forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMForMultipleChoice
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMForMultipleChoice.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
>>> choice0 = "It is eaten with a fork and a knife."
>>> choice1 = "It is eaten while held in the hand."

>>> encoding = tokenizer([prompt, prompt], [choice0, choice1], return_tensors="tf", padding=True)
>>> inputs = {k: tf.expand_dims(v, 0) for k, v in encoding.items()}
>>> outputs = model(inputs)  # batch size is 1

>>> # the linear classifier still needs to be trained
>>> logits = outputs.logits
TFXLMForTokenClassification
class transformers.TFXLMForTokenClassification
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
call
( input_ids: TFModelInputType | None = None, attention_mask: np.ndarray | tf.Tensor | None = None, langs: np.ndarray | tf.Tensor | None = None, token_type_ids: np.ndarray | tf.Tensor | None = None, position_ids: np.ndarray | tf.Tensor | None = None, lengths: np.ndarray | tf.Tensor | None = None, cache: Optional[Dict[str, tf.Tensor]] = None, head_mask: np.ndarray | tf.Tensor | None = None, inputs_embeds: np.ndarray | tf.Tensor | None = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, labels: np.ndarray | tf.Tensor | None = None, training: bool = False ) → transformers.modeling_tf_outputs.TFTokenClassifierOutput or tuple(tf.Tensor)

The TFXLMForTokenClassification forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMForTokenClassification
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMForTokenClassification.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> inputs = tokenizer(
...     "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="tf"
... )

>>> logits = model(**inputs).logits
>>> predicted_token_class_ids = tf.math.argmax(logits, axis=-1)

>>> # Note that tokens are classified rather than input words, so there may be more
>>> # predicted token classes than words, and several tokens may map to the same word.
>>> predicted_tokens_classes = [model.config.id2label[t] for t in predicted_token_class_ids[0].numpy().tolist()]

>>> labels = predicted_token_class_ids
>>> loss = tf.math.reduce_mean(model(**inputs, labels=labels).loss)
TFXLMForQuestionAnsweringSimple
class transformers.TFXLMForQuestionAnsweringSimple
( config, *inputs, **kwargs )
Parameters
- config (XLMConfig) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.
XLM Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear layer on top of the hidden-states output to compute span start logits and span end logits).
This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).
This model is also a keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.
TensorFlow models and layers in transformers accept two formats as input:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.

The reason the second format is supported is that Keras methods prefer this format when passing inputs to models and layers. Because of this support, when using methods like model.fit() things should “just work” for you - just pass your inputs and labels in any format that model.fit() supports! If, however, you want to use the second format outside of Keras methods like fit() and predict(), such as when creating your own layers or models with the Keras Functional API, there are three possibilities you can use to gather all the input Tensors in the first positional argument:

- a single Tensor with input_ids only and nothing else: model(input_ids)
- a list of varying length with one or several input Tensors IN THE ORDER given in the docstring: model([input_ids, attention_mask]) or model([input_ids, attention_mask, token_type_ids])
- a dictionary with one or several input Tensors associated to the input names given in the docstring: model({"input_ids": input_ids, "token_type_ids": token_type_ids})
Note that when creating models and layers with subclassing then you don’t need to worry about any of this, as you can just pass inputs like you would to any other Python function!
call
( input_ids: TFModelInputType | None = None, attention_mask: np.ndarray | tf.Tensor | None = None, langs: np.ndarray | tf.Tensor | None = None, token_type_ids: np.ndarray | tf.Tensor | None = None, position_ids: np.ndarray | tf.Tensor | None = None, lengths: np.ndarray | tf.Tensor | None = None, cache: Optional[Dict[str, tf.Tensor]] = None, head_mask: np.ndarray | tf.Tensor | None = None, inputs_embeds: np.ndarray | tf.Tensor | None = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, start_positions: np.ndarray | tf.Tensor | None = None, end_positions: np.ndarray | tf.Tensor | None = None, training: bool = False ) → transformers.modeling_tf_outputs.TFQuestionAnsweringModelOutput or tuple(tf.Tensor)

The TFXLMForQuestionAnsweringSimple forward method overrides the __call__ special method.

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Example:
>>> from transformers import AutoTokenizer, TFXLMForQuestionAnsweringSimple
>>> import tensorflow as tf

>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-mlm-en-2048")
>>> model = TFXLMForQuestionAnsweringSimple.from_pretrained("FacebookAI/xlm-mlm-en-2048")

>>> question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

>>> inputs = tokenizer(question, text, return_tensors="tf")
>>> outputs = model(**inputs)

>>> answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
>>> answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]

>>> # target span is "nice puppet"
>>> target_start_index = tf.constant([14])
>>> target_end_index = tf.constant([15])

>>> outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
>>> loss = tf.math.reduce_mean(outputs.loss)