Change History#
v5.0#
Release date: TBD
- Added math operators:
absolute(generated),acos,arcsinh(generated),arcsinh_(generated),arcsinh.out(generated),arctanh_(generated),asinh_(generated),ceil,ceil_,ceil.out,diagmma_(generated),equal,floor_(generated),fmin(generated),fmin.out(generated)hardswish_(generated),hypot(generated),i0(generated),i0_(generated),i0.out(generated),log1p_(generated),logaddexp(generated),logaddexp.out(generated),logical_and_,logical_or_logit(generated),logit_(generated),logit.out(generated),sgn_(generated),sinh_(generated),special_i1(generated),special_i1.out(generated) - Added BLAS operator:
bmm.out,cutlass_scaled_mm_sm_90,tril(generated) - Added MoE operators:
dispatch_fused_moe_kernel,grouped_topk,inplace_fused_experts,outplace_fused_experts - Added distribution operator:
normal_, - Added neural network operators:
_upsample_nearest_exact1d,apply_repetition_penalties(generated),chunk_gated_delta_rule_fwd,dswiglu,embedding_dense_backward,fused_recurrent_gated_delta_rule_fwd,hardsigmoid(generated),hardsigmoid.out(generated),nll_loss_nd_backward,nll_loss_nd_forward,one_hot,pixel_unshuffle(generated),pixel_unshuffle.out(generated),prelu(generated),reflection_pad1d(generated),reflection_pad1d.out(generated),reflection_pad2d(generated),reflection_pad2d.out(generated),relu6(generated),swiglu,triu_,unfold_backward,upsample_bicubic2d,upsample_linear1d,upsample_nearest1d,upsample_nearest3d - Added tensor operators:
_functional_sym_constrain_range_for_size(generated),alias_copy(generated),alias_copy.out(generated),fill.Scalar_out,fill.Tensor_out,lift_fresh_copy(generated),replication_pad1d(generated),replication_pad1d.out(generated),replication_pad3d,rrelu_with_noise_backward(generated),selu(generated),selu_(generated),slice_backward(generated),softshrink(generated),softshrink.out(generated),t_copy(generated),t_copy.out(generated),unfold_backward,zero(generated),zero_,zero.out(generated) - Removed reduction operator
moe_sum. - Added reduction operators:
bincount - Added DSA operators:
spare_mla_fwd
v4.2#
Release date: 2026-01-04
- This release contains 216 operators, see Operator List
- Added neural network operators:
avg_pool2d,avg_pool2d_backward,dgeglu(alpha),dreglu(alpha),geglu(alpha),reglu(alpha), - Added BLAS operators:
baddbmm, - Added convolution operators:
conv1d,conv2d,conv3d, - Added tensor operator:
copy_,masked_scatter,masked_scatter_,per_token_group_quant_fp8(alpha),scatter_add_,to_copy. - Added reduction operator:
moe_sum,scaled_softmax_backward,scaled_softmax_forward, - Added math operator:
exp_out,tan,tan_,true_divide_out, - Promoted tensor operator
continuousto beta stage. - Promoted reduction operator
indexto stable stage. - The previous
upsampleoperator is now split intoupsample_nearest2dandupsample_bicubic2d_aa
v4.1#
Release date: 2025-11-01
- Added two fused RWKV operators:
rwkv_ka_fusion,rwkv_mm_sparsity - Adopted by the RWKV project in BlinkDL/Albatross:faster_251101
v4.0#
Release date: 2025-10-31
Supports 202 operators in total.
Added linear algebra operators:
addcdiv,addcmul,addmv,addmv_out,addrAdded BLAS operators:
addmm_outAdded math operators:
atan,atan_,bitwise_left_shift,bitwise_right_shift,clamp_min,clamp_min_,exp2,exp2_,sqrt_Added neural network operators:
celu,celu_,elu_,elu_backward,get_scheduler_metadata,glu_backward,moe_align_block_size,softplusAdded reduction operators:
index(beta),std,traceAdded tensor operators:
index_add_,logspace,max_pool2d_backward,max_pool2d_with_indices,topk_softmaxTriton JIT C++ runtime now ships precompiled kernels for:
add,addmm,argmax,bmm,bmm_out,cat,contiguous,embedding,exponential_,fill,flash_attn_varlen_func,fused_add_rms_norm,max,mm,nonzero,reshape_and_cache,reshape_and_cache_flash,rms_norm,rms_norm_backward,rotary_embedding,softmax,sum,topk,zeros
v3.0#
Release date: 2025-07-14
- Support 184 operators in total, including custom operators used in large model inference
- New hardware platforms supported: Ascend, AIPU, etc.
- Added compatibility with the vLLM framework, with the inference verification of DeepSeek model passed
- Added BLAS operator:
dot,mm_out - Added tensor operators:
index_put_,scatter_,sort_stable - Added linear algebra operators:
lerp,sum_dim_out,sum_out - Added math operators:
angle,nan_to_num,polar,tanh_backward - Added reduction operators:
cummax,cumsum_out,eye,index(alpha),layer_norm_backward,softmax_backward - Added neural network operators:
batch_norm,batch_norm_backward,dropout_backward,embedding_backward,flash_attention_forward,gelu_backward,glu,group_norm_backward,log_softmax(stable),log_softmax_backward,sigmoid_backward,silu_backward,threshold,threshold_backward,weight_norm_interface_backward - Removed neural network operator:
cross_entropy_loss,instance_norm - Removed linear algebra operators:
outer(fused),
v2.2#
Release date: 2025-04-17
- Added BLAS operators:
vdot - Added tensor operators:
cat,constant_pad_nd,contiguous(alpha),diag,diag_embed,fill,fill_,hstack,index_add,index_put,isin,kron,linspace,masked_fill,masked_fill_,quantile,repeat_interleave_self_tensor,repeat_interleave_tensor,scatter,select_scatter,slice_scatter,sort,stack,vstack,where_out(stable), - Added linear algebra operators:
diagonal_backward - Added convolution operators:
_conv_depthwise2d, - Added neural network operators:
_upsample_bicubic2d_aa,elu,gelu_,instance_norm(fused),mse_loss,nll_loss_backward,nll_loss_forward,nll_loss2d_backward,nll_loss2d_forward,relu_,scaled_dot_product_attention,scaled_dot_product_attention_backward,scaled_dot_product_attention_forward,sigmoid_,silu_,upsample_nearest2d,weight_norm_interface, - Added math operators:
abs_,add_,bitwise_and_,bitwise_not_,bitwise_or_,clamp_,cos_,div_mode_,exp_,floor_divide_,log,log_sigmoid,logical_and,logical_not,logical_or,logical_xor,mul_,pow_,reciprocal_,remainder,remainder_,rsqrt_,sin_,sub_,tanh_, - Added reduction operators:
argmin,count_nonzero,cummin,gather,gather_backward, - Added science operators:
erf_, - Added distribution operators:
randperm,
v2.1#
Release date: 2024-09-05
- Added Tensor operators:
_unique2,arange,flip,full,full_like,index_select,masked_fill(alpha),masked_select,ones,ones_like,pad,repeat,tile,unique,where,zeros,zeros_like - support neural network operator:
embedding - support basic math operators:
allclose,floor_divide,isclose,isfinite,maximum,minimum,true_divide,trunc_divide - support distribution operators:
exponential_,multinomial,nonzero,normal,topk,rand,rand_like,randn,randn_like,topk,uniform_ - Added science operators:
erf,resolve_conj,resolve_neg
v2.0#
Release date: 2024-05-31
- Added BLAS operators:
mv,outer - Added math operators:
bitwise_and,bitwise_not,bitwise_or,clamp,cos,eq,ge,gt,isinf,isnan,le,lt,ne,neg,or,sigmoidsin,tanh, - Added reduction operators:
all,amax,any,argmax,cross_entropy_loss,group_norm,log_softmax(alpha),max,min,prod,rms_norm,rms_norm_backward,rms_norm_forward,sum,var_mean,vector_norm, - Added neural network operators:
apply_rotary_position_embedding,fused_add_rms_norm,gelu_and_mul,outer(alpha),silu_and_mul,silu_and_mul_out,skip_layer_norm
v1.0#
Release date: 2024-05-10
- Added BLAS operators:
addmm,bmm,mm - Added math operators:
abs,add,div,dropout,exp,gelu,mul,pow,reciprocal,relu,rsqrt,silu,sub,triu - Added reduction operators:
cumsum,layer_norm,mean,softmax