Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction

  1. Motivation
  2. GDCN
  3. Code
  4. Reference

Motivation

1.Proposes an upgraded version of DCN-V2 called GDCN, which introduces an information gate that adaptively learns the importance of the previous layer's cross result

GDCN

1.GDCN 核心实现

x_{l+1} = x_0 * (w_l * x_l + b_l) * sigmoid(w_g * x_l) + x_l

Intuitively,
1.GDCN adds an information gate on top of the DCN cross structure, adaptively learning the importance of the previous layer's cross result: we expect this to amplify the more important features and dampen the unimportant ones; as the number of cross layers grows, the gate in each cross layer filters the next-order cross features and effectively controls the information flow
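The effect of the gate can be sketched with a small NumPy example (the shapes and random weights below are hypothetical, not from the paper): the sigmoid output lies strictly in (0, 1) element-wise, so it can only shrink or pass each dimension of the new cross term before the residual add, never flip its sign.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_cross_layer(x0, xl, wl, b, wg):
    """One GDCN cross layer: x0 * (xl @ wl + b) * sigmoid(xl @ wg) + xl."""
    cross = x0 * (xl @ wl + b)   # DCN-v2 style cross term
    gate = sigmoid(xl @ wg)      # element-wise gate in (0, 1)
    return cross * gate + xl     # residual connection

rng = np.random.default_rng(0)
d = 4
x0 = rng.normal(size=(2, d))
wl, wg = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b = np.zeros(d)

out = gated_cross_layer(x0, x0, wl, b, wg)
gate = sigmoid(x0 @ wg)
assert np.all((gate > 0) & (gate < 1))  # gate rescales, never flips sign
assert out.shape == (2, d)
```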

Code

A self-implemented TensorFlow version

"""
tensorflow online version
"""
def dcn_stack_net(self, input_layer, cross_layer_num, output_size):
  """
  x_l = x_0 \odot (W \cdot x_l + b) + x_l
  \odot denotes element-wise tensor multiplication, usually implemented with tf.multiply()
  """
  name = 'dcn'
  act_fun = tf.nn.tanh
  with tf.variable_scope("dcn_stack_net"):
    x_0 = input_layer
    x_l = x_0
    input_size = input_layer.shape[1].value
    for i in range(cross_layer_num):
      w      = tf.get_variable(shape=[input_size, input_size], name=f"{name}_kernel_{i+1}", initializer=tf.random_normal_initializer(stddev=1.0 / math.sqrt(float(input_size))), trainable=True)
      b      = tf.get_variable(shape=[input_size], name=f"{name}_bias_{i+1}", initializer=tf.zeros_initializer, trainable=True)
      dot_   = tf.multiply(x_0, tf.add(tf.matmul(x_l, w), b))
      x_l    = tf.add(dot_, x_l)
      # x_l    = act_fun(x_l)
    output_w = tf.get_variable(shape=[input_size, output_size], name=f"{name}_output_w")
    output_layer = tf.matmul(x_l, output_w)
    return output_layer
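For reference, a NumPy transcription of the same cross layer (a sketch with made-up shapes) makes the docstring formula x_{l+1} = x_0 \odot (W·x_l + b) + x_l easy to check by hand:

```python
import numpy as np

def dcn_cross_layer(x0, xl, w, b):
    """DCN cross layer: x0 * (xl @ w + b) + xl, where * is element-wise."""
    return x0 * (xl @ w + b) + xl

rng = np.random.default_rng(42)
batch, d = 3, 5
x0 = rng.normal(size=(batch, d))
w = rng.normal(size=(d, d))
b = np.zeros(d)

x1 = dcn_cross_layer(x0, x0, w, b)
assert x1.shape == (batch, d)
# with w = 0 and b = 0 the cross term vanishes and the layer is the identity
assert np.allclose(dcn_cross_layer(x0, x0, np.zeros((d, d)), b), x0)
```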

def gdcn_stack_net(self, input_layer, cross_layer_num, output_size):
  """
  formulation: c_{l+1} = c_0 \odot (W_l \cdot c_l + b_l) \odot sigmoid(W_g \cdot c_l) + c_l
  code:        x_l     = x_0 \odot (w_l \cdot x_l + b)   \odot sigmoid(w_g \cdot x_l) + x_l
  """
  name = 'gdcn'
  with tf.variable_scope("gdcn_stack_net"):
    x_0 = input_layer
    x_l = x_0
    input_size = input_layer.shape[1].value
    for i in range(cross_layer_num):
      wl = tf.get_variable(shape=[input_size, input_size], name=f"{name}_wl_{i+1}", initializer=tf.random_normal_initializer(stddev=1.0 / math.sqrt(float(input_size))), trainable=True)
      wg = tf.get_variable(shape=[input_size, input_size], name=f"{name}_wg_{i+1}", initializer=tf.random_normal_initializer(stddev=1.0 / math.sqrt(float(input_size))), trainable=True)
      b  = tf.get_variable(shape=[input_size], name=f"{name}_b_{i+1}", initializer=tf.zeros_initializer, trainable=True)
      dot1 = tf.multiply(x_0, tf.add(tf.matmul(x_l, wl), b))
      dot2 = tf.nn.sigmoid(tf.matmul(x_l, wg))
      x_l  = tf.add(tf.multiply(dot1, dot2), x_l)
    output_w = tf.get_variable(shape=[input_size, output_size], name=f"{name}_output_w")
    output_layer = tf.matmul(x_l, output_w)
    return output_layer
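One quick sanity check on the gate (a NumPy sketch with hypothetical weights): if w_g is the zero matrix, every gate value is sigmoid(0) = 0.5, so a GDCN layer should produce exactly half of the DCN cross term plus the residual.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dcn_layer(x0, xl, wl, b):
    return x0 * (xl @ wl + b) + xl

def gdcn_layer(x0, xl, wl, b, wg):
    return x0 * (xl @ wl + b) * sigmoid(xl @ wg) + xl

rng = np.random.default_rng(1)
d = 6
x0 = rng.normal(size=(2, d))
wl = rng.normal(size=(d, d))
b = rng.normal(size=d)

dcn_out = dcn_layer(x0, x0, wl, b)
gdcn_out = gdcn_layer(x0, x0, wl, b, np.zeros((d, d)))
# gate = sigmoid(0) = 0.5 everywhere, so the cross term is halved
assert np.allclose(gdcn_out - x0, 0.5 * (dcn_out - x0))
```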

Reference

[1]. Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction.


Please credit the source when reposting: goldandrabbit.github.io
