Compartilhamento de tecnologia

240711_Shengsi Learning Check-in-Day23-LSTM Anotação de sequência CRF (2)

2024-07-12

한어Русский языкEnglishFrançaisIndonesianSanskrit日本語DeutschPortuguêsΕλληνικάespañolItalianoSuomalainenLatina

240711_Shengsi Learning Check-in-Day23-LSTM+anotação de sequência CRF (2)

Hoje gravamos a segunda parte da anotação da sequência LSTM+CRF.Apenas um simples registro

Cálculo de pontuação

首先计算正确标签序列所对应的得分,这里需要注意,除了转移概率矩阵𝐏外,还需要维护两个大小为|𝑇|的向量,分别作为序列开始和结束时的转移概率。同时我们引入了一个掩码矩阵𝑚𝑎𝑠𝑘,将多个序列打包为一个Batch时填充的值忽略,使得Score计算仅包含有效的Token。

def compute_score(emissions, tags, seq_ends, mask, trans, start_trans, end_trans):
    # emissions: (seq_length, batch_size, num_tags)
    # tags: (seq_length, batch_size)
    # mask: (seq_length, batch_size)

    seq_length, batch_size = tags.shape
    mask = mask.astype(emissions.dtype)

    # 将score设置为初始转移概率
    # shape: (batch_size,)
    score = start_trans[tags[0]]
    # score += 第一次发射概率
    # shape: (batch_size,)
    score += emissions[0, mnp.arange(batch_size), tags[0]]

    for i in range(1, seq_length):
        # 标签由i-1转移至i的转移概率(当mask == 1时有效)
        # shape: (batch_size,)
        score += trans[tags[i - 1], tags[i]] * mask[i]

        # 预测tags[i]的发射概率(当mask == 1时有效)
        # shape: (batch_size,)
        score += emissions[i, mnp.arange(batch_size), tags[i]] * mask[i]

    # 结束转移
    # shape: (batch_size,)
    last_tags = tags[seq_ends, mnp.arange(batch_size)]
    # score += 结束转移概率
    # shape: (batch_size,)
    score += end_trans[last_tags]

    return score
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32

Cálculo do normalizador

Normalizer是𝑥对应的所有可能的输出序列的Score的对数指数和(Log-Sum-Exp)。此时如果按穷举法进行计算,则需要将每个可能的输出序列Score都计算一遍,共有|𝑇|𝑛个结果。这里我们采用动态规划算法,通过复用计算结果来提高效率。

假设需要计算从第00至第𝑖𝑖个Token所有可能的输出序列得分Score𝑖,则可以先计算出从第0至第𝑖−1个Token所有可能的输出序列得分Score𝑖−1。因此,Normalizer可以改写为以下形式:

Adicione a descrição da imagem

其中ℎ𝑖为第𝑖个Token的发射概率,𝐏是转移矩阵。由于发射概率矩阵ℎ和转移概率矩阵𝐏独立于𝑦的序列路径计算,可以将其提出,可得:

Adicione a descrição da imagem

def compute_normalizer(emissions, mask, trans, start_trans, end_trans):
    # emissions: (seq_length, batch_size, num_tags)
    # mask: (seq_length, batch_size)

    seq_length = emissions.shape[0]

    # 将score设置为初始转移概率,并加上第一次发射概率
    # shape: (batch_size, num_tags)
    score = start_trans + emissions[0]

    for i in range(1, seq_length):
        # 扩展score的维度用于总score的计算
        # shape: (batch_size, num_tags, 1)
        broadcast_score = score.expand_dims(2)

        # 扩展emission的维度用于总score的计算
        # shape: (batch_size, 1, num_tags)
        broadcast_emissions = emissions[i].expand_dims(1)

        # 根据公式(7),计算score_i
        # 此时broadcast_score是由第0个到当前Token所有可能路径
        # 对应score的log_sum_exp
        # shape: (batch_size, num_tags, num_tags)
        next_score = broadcast_score + trans + broadcast_emissions

        # 对score_i做log_sum_exp运算,用于下一个Token的score计算
        # shape: (batch_size, num_tags)
        next_score = ops.logsumexp(next_score, axis=1)

        # 当mask == 1时,score才会变化
        # shape: (batch_size, num_tags)
        score = mnp.where(mask[i].expand_dims(1), next_score, score)

    # 最后加结束转移概率
    # shape: (batch_size, num_tags)
    score += end_trans
    # 对所有可能的路径得分求log_sum_exp
    # shape: (batch_size,)
    return ops.logsumexp(score, axis=1)
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39

Foto do check-in:

Adicione a descrição da imagem