Commit 8eaf827e, authored Feb 01, 2026 by xuchengsi
Documentation of polars operations for the TensorEval node
Parent: 12daaf1c
Showing 1 changed file with 3269 additions and 0 deletions.
rustscript/RustDFScript手册.md (0 → 100644, +3269 -0)
# RustDFScript Language Specification
## 1. Language Overview
RustDFScript is a language for DataFrame operations.
## 2. Lexical Rules
### 2.1 Identifiers
```rustdfscript
// Valid identifiers
variable_name
matrix1
bus_data
PQ_bus
```
### 2.2 Numeric Literals
```rustdfscript
// Integers
42
-17
0
// Floating-point numbers
3.14
-2.718
1.23e-4
6.022e23
// Scientific notation
1.5e10
-2.3E-5
```
### 2.3 String Literals
```rustdfscript
// Strings use double quotes
"Hello, World!"
"Power flow data for IEEE 14 bus"
"File path: /data/case14.txt"
```
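### 2.4 Mathematical Constant Conventions
- π: pi
- Euler's number: e
- Not a number (an invalid result such as 0/0): NAN
- Positive infinity: INF
- Negative infinity: NEG_INF
### 2.5 Comments
```rustdfscript
// Single-line comment
/* Multi-line comment */
/*
* Block comment
* spanning multiple lines
*/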
```
### 2.6 Data Types
```rustdfscript
null
i8
i16
i32
i64
i128
u8
u16
u32
u64
u128
f32
f64
bool
binary
str
date
time
```
### 2.7 Creating Variables and Assigning Values
```rustdfscript
len = height(input1);
output = with_columns(alias(cast(arange(0, len) + 1, u32), index), input1);
```
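The two statements above read the row count and then append a 1-based `index` column. The following Python sketch mirrors that idea. It assumes a DataFrame can be modeled as a dict of equal-length column lists and that `arange(0, len)` is half-open; the helper name `add_index` is hypothetical and not part of RustDFScript.

```python
def add_index(df):
    """Return a copy of df (a dict of column lists) with a 1-based 'index' column."""
    # Counterpart of height(input1): row count taken from any column (0 if empty).
    height = len(next(iter(df.values()))) if df else 0
    out = {name: list(values) for name, values in df.items()}
    # Counterpart of arange(0, len) + 1, assumed to yield 1, 2, ..., len.
    out["index"] = [i + 1 for i in range(height)]
    return out


input1 = {"Element": ["Copper", "Silver", "Gold"], "Proton": [29, 47, 79]}
output = add_index(input1)
```

The input is copied rather than modified, which matches the way the script assigns the result to a new variable `output`.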
## 3. Functions That Operate on a Whole DataFrame
### 3.1 Functions That Do Not Return a DataFrame
#### 3.1.1 Get the Number of Rows of a DataFrame
```rustdfscript
Function
height
Syntax
height(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+----------+
| Element | Proton | Electron |
| --- | --- | --- |
| str | i32 | i32 |
+=========+========+==========+
| Copper | 29 | 29 |
+---------+--------+----------+
| Silver | 47 | 47 |
+---------+--------+----------+
| Gold | 79 | 79 |
+---------+--------+----------+
Result: 3
```
#### 3.1.2 Get the Number of Columns of a DataFrame
```rustdfscript
Function
width
Syntax
width(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+
| Element | Proton |
| --- | --- |
| str | i32 |
+=========+========+
| Copper | 29 |
+---------+--------+
| Silver | 47 |
+---------+--------+
| Gold | 79 |
+---------+--------+
Result: 2
```
#### 3.1.3 Get the Total Number of Elements of a DataFrame
```rustdfscript
Function
size
Syntax
size(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+
| Element | Proton |
| --- | --- |
| str | i32 |
+=========+========+
| Copper | 29 |
+---------+--------+
| Silver | 47 |
+---------+--------+
| Gold | 79 |
+---------+--------+
Result: 6
```
**Note: functions that do not return a DataFrame cannot be nested inside functions that return a DataFrame.**
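The three scalar functions in this section relate as size = height × width. A small Python sketch of that relationship, assuming a DataFrame is modeled as a dict of equal-length column lists (an illustrative model, not the RustDFScript implementation):

```python
def height(df):
    """Number of rows: the length of any column, or 0 when there are no columns."""
    return len(next(iter(df.values()))) if df else 0


def width(df):
    """Number of columns."""
    return len(df)


def size(df):
    """Total number of elements, i.e. rows times columns."""
    return height(df) * width(df)


df = {"Element": ["Copper", "Silver", "Gold"], "Proton": [29, 47, 79]}
shape = (height(df), width(df), size(df))
```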
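### 3.2 Functions That Operate on a Single DataFrame
#### 3.2.1 Select Columns
```rustdfscript
Function
select
Syntax
select(col(col_name), input_df);
Parameters
col - function that references a DataFrame column
col_name - column name
input_df - the input DataFrame
Example
Input input_df: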
+---------+--------+----------+
| Element | Proton | Electron |
| --- | --- | --- |
| str | i32 | i32 |
+=========+========+==========+
| Copper | 29 | 29 |
+---------+--------+----------+
| Silver | 47 | 47 |
+---------+--------+----------+
| Gold | 79 | 79 |
+---------+--------+----------+
col_name: Element
Result
+---------+
| Element |
| --- |
| str |
+=========+
| Copper |
+---------+
| Silver |
+---------+
| Gold |
+---------+
```
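#### 3.2.2 Filter Rows by Condition
```rustdfscript
Function
filter
Syntax
filter(col(col_name), condition_expr, input_df);
Parameters
col - function that references a DataFrame column
col_name - column name
condition_expr - filter condition expression
input_df - the input DataFrame
Example
Input input_df: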
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 29 |
+---------+--------+
| B | 47 |
+---------+--------+
| C | 79 |
+---------+--------+
| D | 79 |
+---------+--------+
| E | 79 |
+---------+--------+
col_name: lit
condition_expr: col(value) > 2
Result
+---------+
| lit |
| --- |
| str |
+=========+
| B |
+---------+
| C |
+---------+
| E |
+---------+
```
#### 3.2.3 Add or Replace Columns
```rustdfscript
Function
with_columns
Syntax
with_columns(expr, input_df);
Parameters
expr - expression
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr: with_columns(replace(col(name), James, Jordan), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | Jordan |
+---------+--------+
| C | Curry |
+---------+--------+
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr: with_columns(alias(replace(col(name), James, Jordan), new_name), input_df)
Result
+---------+--------+----------+
| lit | name | new_name |
| --- | --- | --- |
| str | str | str |
+=========+========+==========+
| A | Kobe | Kobe |
+---------+--------+----------+
| B | James | Jordan |
+---------+--------+----------+
| C | Curry | Curry |
+---------+--------+----------+
```
#### 3.2.4 Group Data
```rustdfscript
Function
group_by
Syntax
group_by(col(col_name), input_df);
Parameters
col - function that references a DataFrame column
col_name - name of the column to group by
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| A | 2 |
+---------+--------+
| B | 3 |
+---------+--------+
| B | 4 |
+---------+--------+
col_name: lit
condition_expr: cum_sum(value)
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | list[f64] |
+=========+========+
| A | [1,2] |
+---------+--------+
| B | [3,4] |
+---------+--------+
```
#### 3.2.5 Drop Rows Containing Null Values
```rustdfscript
Function
drop_nulls
Syntax
drop_nulls(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.6 Drop Rows Containing NaN Values
```rustdfscript
Function
drop_nans
Syntax
drop_nans(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.7 Fill Null Values
```rustdfscript
Function
fill_null
Syntax
fill_null(fill_value, input_df);
Parameters
fill_value - value used for filling
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
fill_value: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.8 Fill NaN Values
```rustdfscript
Function
fill_nan
Syntax
fill_nan(fill_value, input_df);
Parameters
fill_value - value used for filling
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
fill_value: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
```
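The manual treats null (a missing value) and NaN (an invalid floating-point value) as separate cases, each with its own drop and fill function. The Python sketch below illustrates the distinction for a single column, modeling null as `None` and NaN as `float("nan")`; both the model and the list-based helper signatures are assumptions made for illustration.

```python
import math


def fill_null(fill_value, column):
    """Replace missing entries (None) and leave NaN untouched."""
    return [fill_value if v is None else v for v in column]


def fill_nan(fill_value, column):
    """Replace NaN entries and leave missing entries (None) untouched."""
    return [
        fill_value if isinstance(v, float) and math.isnan(v) else v
        for v in column
    ]


values = [1.0, None, float("nan")]
only_nulls_filled = fill_null(2.0, values)  # the NaN entry is still NaN
only_nans_filled = fill_nan(2.0, values)    # the None entry is still None
```

Under this model, filling both kinds of gaps takes two passes, one per function.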
#### 3.2.9 Count Non-Null Values per Column
```rustdfscript
Function
count
Syntax
count(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| i64 | i64 |
+=========+========+
| 3 | 2 |
+---------+--------+
```
#### 3.2.10 Count Null Values per Column
```rustdfscript
Function
null_count
Syntax
null_count(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| i64 | i64 |
+=========+========+
| 0 | 1 |
+---------+--------+
```
#### 3.2.11 Get the First Row
```rustdfscript
Function
first
Syntax
first(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
```
#### 3.2.12 Get the Last Row
```rustdfscript
Function
last
Syntax
last(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| C | 3 |
+---------+--------+
```
#### 3.2.13 Reverse a DataFrame
```rustdfscript
Function
reverse
Syntax
reverse(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| C | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| A | 1 |
+---------+--------+
```
#### 3.2.14 Cast Data Types
```rustdfscript
Function
cast_all
Syntax
cast_all(dtype, input_df);
Parameters
dtype - target data type
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
dtype: str
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | str |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
```
#### 3.2.15 Limit a DataFrame to the First n Rows
```rustdfscript
Function
limit
Syntax
limit(n, input_df);
Parameters
n - number of rows
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
#### 3.2.16 Limit a DataFrame to the Last n Rows
```rustdfscript
Function
tail
Syntax
tail(n, input_df);
Parameters
n - number of rows
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
```
#### 3.2.17 Sum of Each Column
```rustdfscript
Function
sum
Syntax
sum(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 6 | 15 |
+---------+--------+
```
#### 3.2.18 Maximum of Each Column
```rustdfscript
Function
max
Syntax
max(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 3 | 6 |
+---------+--------+
```
#### 3.2.19 Minimum of Each Column
```rustdfscript
Function
min
Syntax
min(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
```
#### 3.2.20 Mean of Each Column
```rustdfscript
Function
mean
Syntax
mean(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
```
#### 3.2.21 Median of Each Column
```rustdfscript
Function
median
Syntax
median(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 5 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
```
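The column-wise aggregations in 3.2.17 to 3.2.21 all reduce each column to a single value and return a one-row result. A Python sketch of that pattern, assuming a DataFrame is modeled as a dict of column lists (the helper name `aggregate` is hypothetical):

```python
import statistics


def aggregate(df, fn):
    """Apply a reducing function to every column, giving one value per column."""
    return {name: fn(values) for name, values in df.items()}


df = {"lit": [2.0, 2.0, 3.0], "value": [4.0, 5.0, 5.0]}
medians = aggregate(df, statistics.median)
sums = aggregate(df, sum)
maxima = aggregate(df, max)
```

The data is the input table from the median example, so `medians` can be compared against the result shown there.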
#### 3.2.22 Shift
```rustdfscript
Function
shift
Syntax
shift(n, input_df);
Parameters
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| None | None |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
#### 3.2.23 Shift (Fill Vacated Positions with None)
```rustdfscript
Function
shift
Syntax
shift(n, input_df);
Parameters
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| None | None |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
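#### 3.2.24 Shift (Fill Vacated Positions with a Given Value)
```rustdfscript
Function
shift_and_fill
Syntax
shift_and_fill(fill_value, n, input_df);
Parameters
fill_value - value used for filling
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df: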
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
fill_value: -1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| -1 | -1 |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
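The shift functions move every value down by `n` rows and fill the vacated positions. A Python sketch for one column follows; handling of negative `n` (shifting upward) is an assumption borrowed from polars and is not stated in this manual.

```python
def shift_and_fill(fill_value, n, column):
    """Shift a column by n positions, filling the vacated slots with fill_value."""
    size = len(column)
    if n >= 0:
        k = min(n, size)
        # Values move toward the end; the first k slots are vacated.
        return [fill_value] * k + column[: size - k]
    k = min(-n, size)
    # Assumed behavior for negative n: values move toward the start.
    return column[k:] + [fill_value] * k


def shift(n, column):
    """Plain shift: vacated slots become None."""
    return shift_and_fill(None, n, column)


value = [4.0, 5.0, 6.0]
shifted = shift(1, value)
shifted_filled = shift_and_fill(-1.0, 1, value)
```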
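### 3.3 Functions That Operate on Multiple DataFrames
#### 3.3.1 Join DataFrames
```rustdfscript
Function
join
Syntax
join(df1, df2, left_on, right_on, how);
Parameters
df1 - the first DataFrame
df2 - the second DataFrame
left_on - join key of the left table
right_on - join key of the right table
how - join type: inner, full, left, or right
Example
Input df1: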
+---------+--------+
| lit | name1 |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
Input df2:
+---------+--------+
| name2 | value |
| --- | --- |
| str | f64 |
+=========+========+
| Kobe | 1 |
+---------+--------+
| Jordan | 2 |
+---------+--------+
| Curry | 3 |
+---------+--------+
left_on: name1
right_on: name2
how: inner
Result
+---------+--------+--------+
| lit | name1 | value |
| --- | --- | --- |
| str | str | f64 |
+=========+========+========+
| A | Kobe | 1 |
+---------+--------+--------+
| C | Curry | 3 |
+---------+--------+--------+
```
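An inner join keeps only the rows whose keys appear in both tables. The Python sketch below reproduces the example above with rows modeled as dicts; the helper name `inner_join` and the choice to drop the right-hand key column from the output are assumptions read off the result table.

```python
def inner_join(df1, df2, left_on, right_on):
    """Inner-join two lists of row dicts on df1[left_on] == df2[right_on]."""
    # Index the right table by key so each left row is matched in O(1).
    by_key = {}
    for row in df2:
        by_key.setdefault(row[right_on], []).append(row)

    result = []
    for left in df1:
        for right in by_key.get(left[left_on], []):
            merged = dict(left)
            # Copy every right-hand column except the duplicate key column.
            merged.update({k: v for k, v in right.items() if k != right_on})
            result.append(merged)
    return result


df1 = [
    {"lit": "A", "name1": "Kobe"},
    {"lit": "B", "name1": "James"},
    {"lit": "C", "name1": "Curry"},
]
df2 = [
    {"name2": "Kobe", "value": 1.0},
    {"name2": "Jordan", "value": 2.0},
    {"name2": "Curry", "value": 3.0},
]
joined = inner_join(df1, df2, "name1", "name2")
```

James has no match in `df2` and Jordan has no match in `df1`, so neither appears in `joined`.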
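#### 3.3.2 Sort
```rustdfscript
Function
sort
Syntax
sort(by, descending, maintain_order, input_df);
Parameters
by - names of the columns to sort by (list)
descending - sort direction for each column (list): true for descending, false for ascending
maintain_order - whether to keep the original order of equal elements: true keeps it, false does not
input_df - the input DataFrame
Example
Input input_df: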
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 2 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 5 |
+---------+--------+
by: [value]
descending: [true]
maintain_order: true
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| E | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| A | 2 |
+---------+--------+
| D | 1 |
+---------+--------+
```
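A multi-key sort with a per-key direction can be built from repeated stable sorts, applied from the last key to the first. The Python sketch below shows this for rows modeled as dicts; it always preserves the order of equal elements, which corresponds to `maintain_order: true`. The helper name `sort_rows` is hypothetical.

```python
def sort_rows(rows, by, descending):
    """Sort row dicts by several keys, each with its own direction."""
    result = list(rows)
    # Python's sort is stable (also with reverse=True), so sorting by the
    # least significant key first yields a correct multi-key ordering.
    for name, desc in reversed(list(zip(by, descending))):
        result.sort(key=lambda row: row[name], reverse=desc)
    return result


rows = [
    {"lit": "A", "value": 2.0},
    {"lit": "B", "value": 4.0},
    {"lit": "C", "value": 3.0},
    {"lit": "D", "value": 1.0},
    {"lit": "E", "value": 5.0},
]
sorted_rows = sort_rows(rows, ["value"], [True])
order = [row["lit"] for row in sorted_rows]
```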
#### 3.3.3 Concatenate DataFrames
```rustdfscript
Function
concat
Syntax
concat(how, df1, df2);
Parameters
df1 - the first DataFrame
df2 - the second DataFrame
how - concatenation mode, for example horizontal or vertical
Example
Input df1:
+---------+--------+
| lit | name1 |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
Input df2:
+---------+--------+
| name2 | value |
| --- | --- |
| str | f64 |
+=========+========+
| Kobe | 1 |
+---------+--------+
| Jordan | 2 |
+---------+--------+
| Curry | 3 |
+---------+--------+
how: horizontal
Result
+---------+--------+--------+--------+
| lit | name1 | name2 | value |
| --- | --- | --- | --- |
| str | str | str | f64 |
+=========+========+========+========+
| A | Kobe | Kobe | 1 |
+---------+--------+--------+--------+
| B | James | Jordan | 2 |
+---------+--------+--------+--------+
| C | Curry | Curry | 3 |
+---------+--------+--------+--------+
```
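#### 3.3.4 Pivot (Convert Long-Format Data to Wide Format)
```rustdfscript
Function
pivot
Syntax
pivot(on, index, values, sort_columns, agg_expr, sep, input_df);
Parameters
on - pivot columns (list)
index - row index columns (list)
values - value columns (list)
sort_columns - whether to sort the generated column names alphabetically (boolean)
agg_expr - aggregation expression
sep - separator used in generated column names
input_df - the input DataFrame
Example
Input input_df: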
+---------+--------+--------+
| lit | year | value |
| --- | --- | --- |
| str | i64 | i64 |
+=========+========+========+
| A | 2021 | 1 |
+---------+--------+--------+
| A | 2022 | 2 |
+---------+--------+--------+
| A | 2023 | 3 |
+---------+--------+--------+
| B | 2021 | 4 |
+---------+--------+--------+
| B | 2022 | 5 |
+---------+--------+--------+
| B | 2023 | 6 |
+---------+--------+--------+
| B | 2023 | 7 |
+---------+--------+--------+
on: [year]
index: [lit]
values: [value]
sort_columns: false
agg_expr:
sep: _
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| E | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| A | 2 |
+---------+--------+
| D | 1 |
+---------+--------+
```
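A pivot turns each distinct value of the `on` column into its own output column, with one output row per distinct `index` value. The Python sketch below shows that long-to-wide reshaping on the input table from the example. Because the example leaves `agg_expr` blank, summing duplicate cells is purely an assumption here, as are the helper name `pivot_sum` and the use of the bare year as the generated column name.

```python
def pivot_sum(rows, on, index, values):
    """Long-to-wide pivot of row dicts; duplicate cells are summed (assumed)."""
    table = {}
    for row in rows:
        # One output row per distinct index value, in first-seen order.
        wide = table.setdefault(row[index], {index: row[index]})
        column = str(row[on])
        wide[column] = wide.get(column, 0) + row[values]
    return list(table.values())


long_rows = [
    {"lit": "A", "year": 2021, "value": 1},
    {"lit": "A", "year": 2022, "value": 2},
    {"lit": "A", "year": 2023, "value": 3},
    {"lit": "B", "year": 2021, "value": 4},
    {"lit": "B", "year": 2022, "value": 5},
    {"lit": "B", "year": 2023, "value": 6},
    {"lit": "B", "year": 2023, "value": 7},
]
wide_rows = pivot_sum(long_rows, on="year", index="lit", values="value")
```

The two rows for (B, 2023) are the reason an aggregation is needed at all: without one, the cell would have two candidate values.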
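## 4. Functions That Operate on DataFrame Columns
### 4.1 Rename a Column
```rustdfscript
Function
alias
Syntax
alias(col(col_name), new_col_name);
Parameters
col - function that references a DataFrame column
col_name - column name
new_col_name - new column name
Example
Input input_df: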
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
new_col_name: new_name
expr: with_columns(alias(replace(col(name), James, Jordan), new_name), input_df)
Result
+---------+--------+----------+
| lit | name | new_name |
| --- | --- | --- |
| str | str | str |
+=========+========+==========+
| A | Kobe | Kobe |
+---------+--------+----------+
| B | James | Jordan |
+---------+--------+----------+
| C | Curry | Curry |
+---------+--------+----------+
```
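### 4.2 Conditional Logic
```rustdfscript
Function
when_then_otherwise
Syntax
when_then_otherwise(condition, value_if_true, value_if_false);
Parameters
condition - condition expression
value_if_true - value assigned when the condition holds
value_if_false - value assigned when the condition does not hold
Example
Input input_df: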
+---------+--------+
| columns | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr: with_columns(alias(when_then_otherwise(arange(0, 3) == 0, alias(e0,columns), col(columns)), columns), input_df)
Result
+---------+--------+
| columns | name |
| --- | --- |
| str | str |
+=========+========+
| e0 | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
```
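`when_then_otherwise` picks, row by row, between two candidate values according to a boolean condition. A Python sketch of that element-wise selection, with columns modeled as lists (an illustrative model; the list-based signature is an assumption):

```python
def when_then_otherwise(condition, value_if_true, value_if_false):
    """Element-wise choice between two columns driven by a boolean column."""
    return [
        t if c else f
        for c, t, f in zip(condition, value_if_true, value_if_false)
    ]


columns = ["A", "B", "C"]
# Counterpart of arange(0, 3) == 0: true only for the first row.
condition = [i == 0 for i in range(len(columns))]
replaced = when_then_otherwise(condition, ["e0"] * len(columns), columns)
```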
### 4.3 Logical Negation
```rustdfscript
Function
not
Syntax
not(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
Expression: with_columns(alias(not(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.4 Check for Non-Null
```rustdfscript
Function
not_null
Syntax
not_null(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(not_null(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.5 Check for Null
```rustdfscript
Function
is_null
Syntax
is_null(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(is_null(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.6 Check for Non-NaN
```rustdfscript
Function
not_nan
Syntax
not_nan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(not_nan(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.7 Check for NaN
```rustdfscript
Function
is_nan
Syntax
is_nan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(is_nan(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.8 Drop Null Values
```rustdfscript
Function
drop_nulls
Syntax
drop_nulls(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(drop_nulls(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
### 4.9 Drop NaN Values
```rustdfscript
Function
drop_nans
Syntax
drop_nans(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(drop_nans(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
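### 4.10 Check Equality with a Given Value
```rustdfscript
Function
eq
Syntax
eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: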
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
Expression: with_columns(alias(eq(col(name), 1), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
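### 4.11 Check Inequality with a Given Value
```rustdfscript
Function
neq
Syntax
neq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: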
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
Expression: with_columns(alias(neq(col(name), 1), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
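### 4.12 Check Less Than a Given Value
```rustdfscript
Function
lt
Syntax
lt(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: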
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(lt(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | false |
+---------+--------+
```
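### 4.13 Check Greater Than a Given Value
```rustdfscript
Function
gt
Syntax
gt(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: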
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(gt(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
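### 4.14 Check Less Than or Equal to a Given Value
```rustdfscript
Function
lt_eq
Syntax
lt_eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: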
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(lt_eq(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
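### 4.15 Check Greater Than or Equal to a Given Value
```rustdfscript
Function
gt_eq
Syntax
gt_eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: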
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(gt_eq(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.16 Count Unique Values
```rustdfscript
Function
n_unique
Syntax
n_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(n_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 2 |
+---------+
```
### 4.17 Get Unique Values
```rustdfscript
Function
unique
Syntax
unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
### 4.18 Get Index Positions of Unique Values
```rustdfscript
Function
arg_unique
Syntax
arg_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 0 |
+---------+
| 2 |
+---------+
| 3 |
+---------+
```
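`n_unique`, `unique`, and `arg_unique` are closely related: the positions returned by `arg_unique` are the first occurrences of each distinct value. A Python sketch of that relationship for one column follows. The list model is an assumption, and so is the first-seen ordering of the unique values, which the manual does not specify.

```python
def arg_unique(column):
    """Indices of the first occurrence of each distinct value, in row order."""
    seen = set()
    positions = []
    for i, value in enumerate(column):
        if value not in seen:
            seen.add(value)
            positions.append(i)
    return positions


values = [1.0, 1.0, 3.0, 2.0, 3.0]
first_positions = arg_unique(values)
# Distinct values in first-seen order (an assumed ordering).
unique_values = [values[i] for i in first_positions]
n_unique = len(first_positions)
```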
### 4.19 Get the Index of the First Minimum
```rustdfscript
Function
arg_min
Syntax
arg_min(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_min(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 0 |
+---------+
```
### 4.20 Get the Index of the First Maximum
```rustdfscript
Function
arg_max
Syntax
arg_max(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_max(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 2 |
+---------+
```
### 4.21 Check Whether Values Are Unique
```rustdfscript
Function
is_unique
Syntax
is_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(is_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| bool |
+========+
| false |
+---------+
| true |
+---------+
| false |
+---------+
| false |
+---------+
| false |
+---------+
```
### 4.22 Check Whether Values Are Duplicated
```rustdfscript
Function
is_duplicated
Syntax
is_duplicated(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(is_duplicated(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| bool |
+========+
| true |
+---------+
| false |
+---------+
| true |
+---------+
| true |
+---------+
| true |
+---------+
```
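`is_unique` and `is_duplicated` are complements of each other: a value is unique when it occurs exactly once in the column and duplicated otherwise. A Python sketch using occurrence counts, with the column modeled as a list (an illustrative assumption):

```python
from collections import Counter


def is_unique(column):
    """True for entries whose value occurs exactly once in the column."""
    counts = Counter(column)
    return [counts[value] == 1 for value in column]


def is_duplicated(column):
    """True for entries whose value occurs more than once in the column."""
    counts = Counter(column)
    return [counts[value] > 1 for value in column]


values = [1.0, 2.0, 3.0, 1.0, 3.0]
unique_mask = is_unique(values)
duplicated_mask = is_duplicated(values)
```

Every occurrence of a repeated value is flagged, including the first one, which is why both 1 entries and both 3 entries are marked as duplicated.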
### 4.23 Reverse Data
```rustdfscript
Function
reverse
Syntax
reverse(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(reverse(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 1 |
+---------+--------+
```
### 4.24 Count Non-Null Values
```rustdfscript
Function
count
Syntax
count(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(count(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 4 |
+---------+
```
### 4.25 Count Null Values
```rustdfscript
Function
null_count
Syntax
null_count(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(null_count(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
```
### 4.26 Get the length
```rustdfscript
Function
len
Syntax
len(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(len(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 5 |
+---------+
```
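The relationship between `count`, `null_count` and `len` shown in 4.24 to 4.26 can be checked with a small plain-Python model. This is a sketch of the documented results only: `None` stands in for `null`, and the helper functions are hypothetical Python stand-ins, not RustDFScript internals.

```python
def count(values):
    """Number of non-null entries."""
    return sum(1 for v in values if v is not None)

def null_count(values):
    """Number of null entries."""
    return sum(1 for v in values if v is None)

# Column `name` from the examples above; None models null.
name = [1.0, 2.0, 3.0, None, 5.0]

print(count(name))       # 4
print(null_count(name))  # 1
print(len(name))         # 5

# len counts every row, so it equals count + null_count.
assert count(name) + null_count(name) == len(name)
```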
### 4.27 Sum
```rustdfscript
Function
sum
Syntax
sum(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(sum(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 15 |
+---------+
```
### 4.28 Minimum
```rustdfscript
Function
min
Syntax
min(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(min(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
```
### 4.29 Maximum
```rustdfscript
Function
max
Syntax
max(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(max(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 5 |
+---------+
```
### 4.30 Mean
```rustdfscript
Function
mean
Syntax
mean(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(mean(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 3 |
+---------+
```
### 4.31 Median
```rustdfscript
Function
median
Syntax
median(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3.5 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(median(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 3.5 |
+---------+
```
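The `mean` and `median` results in 4.30 and 4.31 can be reproduced with Python's standard `statistics` module. This is only a cross-check of the documented numbers. The even-length case at the end is an assumption: the manual does not say how RustDFScript treats a column with an even number of values.

```python
import statistics

# Values taken from the examples in 4.30 and 4.31.
mean_input = [1.0, 2.0, 3.0, 4.0, 5.0]
median_input = [1.0, 2.0, 3.5, 4.0, 5.0]

print(statistics.mean(mean_input))      # 3.0
print(statistics.median(median_input))  # 3.5

# Assumption: for an even number of values, the median is the average of the
# two middle values. The manual does not state RustDFScript's behavior here.
print(statistics.median([1.0, 2.0, 3.0, 4.0]))  # 2.5
```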
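### 4.32 Standard deviation
```rustdfscript
Function
std
Syntax
std(col(col_name), ddof)
Input parameters
col - function that references a DataFrame column
col_name - column name
ddof - delta degrees of freedom (the divisor is N - ddof): 0 gives the population standard deviation, 1 gives the sample standard deviation
Example
Input input_df: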
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -1 |
+---------+--------+
| D | 5 |
+---------+--------+
| E | 4 |
+---------+--------+
Expression: select(alias(std(col(name), 1), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| sqrt(5.3) |
+---------+
```
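### 4.33 Variance
```rustdfscript
Function
var
Syntax
var(col(col_name), ddof)
Input parameters
col - function that references a DataFrame column
col_name - column name
ddof - delta degrees of freedom (the divisor is N - ddof): 0 gives the population variance, 1 gives the sample variance
Example
Input input_df: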
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -1 |
+---------+--------+
| D | 5 |
+---------+--------+
| E | 4 |
+---------+--------+
Expression: select(alias(var(col(name), 1), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 5.3 |
+---------+
```
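To make the role of `ddof` in 4.32 and 4.33 concrete, the sketch below computes the variance as the sum of squared deviations divided by `N - ddof`, and the standard deviation as its square root. It is a plain-Python model of the textbook formula, assumed to match the documented results; the helper names are hypothetical and only mirror the RustDFScript functions.

```python
import math

def var(values, ddof):
    """Variance with divisor N - ddof (ddof=0: population, ddof=1: sample)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

def std(values, ddof):
    """Standard deviation: square root of the variance with the same ddof."""
    return math.sqrt(var(values, ddof))

# Column `name` from the examples above: mean 2.6, squared deviations sum to 21.2.
name = [3.0, 2.0, -1.0, 5.0, 4.0]

print(round(var(name, 1), 6))  # 5.3   (21.2 / 4, sample variance)
print(round(var(name, 0), 6))  # 4.24  (21.2 / 5, population variance)
print(round(std(name, 1), 4))  # 2.3022, i.e. sqrt(5.3)
```

Note that `ddof` must be smaller than the number of rows, otherwise the divisor is zero or negative.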
### 4.34 Cumulative count
```rustdfscript
Function
cum_count
Syntax
cum_count(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_count(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 2 |
+---------+
| 3 |
+---------+
| 3 |
+---------+
| 4 |
+---------+
```
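As the example shows, `cum_count` counts only non-null values, so the running total does not advance on the `null` row. A minimal plain-Python model of that behavior (a hypothetical helper, with `None` standing in for `null`):

```python
def cum_count(values):
    """Running count of non-null entries, one output per input row."""
    out = []
    seen = 0
    for v in values:
        if v is not None:
            seen += 1  # only non-null rows advance the counter
        out.append(seen)
    return out

# Column `name` from the example above.
print(cum_count([1.0, 2.0, 3.0, None, 5.0]))  # [1, 2, 3, 3, 4]
```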
### 4.35 Cumulative sum
```rustdfscript
Function
cum_sum
Syntax
cum_sum(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_sum(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
| 6 |
+---------+
| 10 |
+---------+
| 15 |
+---------+
```
### 4.36 Cumulative minimum
```rustdfscript
Function
cum_min
Syntax
cum_min(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 6 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 7 |
+---------+--------+
Expression: select(alias(cum_min(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 5 |
+---------+
| 4 |
+---------+
| 4 |
+---------+
| 2 |
+---------+
| 2 |
+---------+
```
### 4.37 Cumulative maximum
```rustdfscript
Function
cum_max
Syntax
cum_max(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 6 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 7 |
+---------+--------+
Expression: select(alias(cum_max(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 5 |
+---------+
| 5 |
+---------+
| 6 |
+---------+
| 6 |
+---------+
| 7 |
+---------+
```
### 4.38 Cumulative product
```rustdfscript
Function
cum_prod
Syntax
cum_prod(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_prod(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 2 |
+---------+
| 6 |
+---------+
| 24 |
+---------+
| 120 |
+---------+
```
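The four cumulative functions in 4.35 to 4.38 all follow the same pattern: each output row combines the previous output with the current value. The sketch below reproduces the documented results with `itertools.accumulate` from the Python standard library. It models the results only and says nothing about how RustDFScript computes them; null handling is not covered.

```python
from itertools import accumulate
import operator

# Input columns from the examples in 4.35/4.38 and 4.36/4.37.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [5.0, 4.0, 6.0, 2.0, 7.0]

cum_sum = list(accumulate(a))                 # running total
cum_prod = list(accumulate(a, operator.mul))  # running product
cum_min = list(accumulate(b, min))            # smallest value seen so far
cum_max = list(accumulate(b, max))            # largest value seen so far

print(cum_sum)   # [1.0, 3.0, 6.0, 10.0, 15.0]
print(cum_prod)  # [1.0, 2.0, 6.0, 24.0, 120.0]
print(cum_min)   # [5.0, 4.0, 4.0, 2.0, 2.0]
print(cum_max)   # [5.0, 5.0, 6.0, 6.0, 7.0]
```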
### 4.39 Absolute value
```rustdfscript
Function
abs
Syntax
abs(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
Expression: with_columns(alias(abs(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
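### 4.40 Power
```rustdfscript
Function
pow
Syntax
pow(col(col_name), exponent)
Input parameters
col - function that references a DataFrame column
col_name - column name
exponent - the power to which each value is raised
Example
Input input_df: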
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
Expression: with_columns(alias(pow(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 9 |
+---------+--------+
| D | 16 |
+---------+--------+
| E | 25 |
+---------+--------+
```
### 4.41 Natural exponential
```rustdfscript
Function
exp
Syntax
exp(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: with_columns(alias(exp(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | e^1 |
+---------+--------+
| B | e^2 |
+---------+--------+
| C | e^3 |
+---------+--------+
| D | e^4 |
+---------+--------+
| E | e^5 |
+---------+--------+
```
### 4.42 Compute ln(1+x)
```rustdfscript
Function
log1p
Syntax
log1p(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | e^1-1 |
+---------+--------+
| B | e^2-1 |
+---------+--------+
| C | e^3-1 |
+---------+--------+
| D | e^4-1 |
+---------+--------+
| E | e^5-1 |
+---------+--------+
Expression: with_columns(alias(log1p(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
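A dedicated `log1p` is usually provided because evaluating `ln(1 + x)` directly loses precision when `x` is very small: the sum `1 + x` is rounded before the logarithm is taken. The sketch below illustrates this with Python's `math.log1p`. It demonstrates the general floating-point motivation and is not a statement about RustDFScript internals.

```python
import math

x = 1e-10

naive = math.log(1.0 + x)  # 1.0 + x is rounded to a nearby double first
stable = math.log1p(x)     # computed without forming 1.0 + x

# For tiny x, ln(1 + x) is x - x**2/2 + ..., so the true value is very close to x.
print(abs(stable - x) < 1e-19)  # True: log1p keeps full precision
print(abs(naive - x) > 1e-18)   # True: the naive form is off by a relative error near 1e-7

# Consistent with the example above: log1p(e^1 - 1) is 1 up to rounding.
print(math.isclose(math.log1p(math.e - 1.0), 1.0))  # True
```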
### 4.43 Logarithm
```rustdfscript
Function
log
Syntax
log(col(col_name), base)
Input parameters
col - function that references a DataFrame column
col_name - column name
base - base of the logarithm
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 4 |
+---------+--------+
| D | 8 |
+---------+--------+
| E | 16 |
+---------+--------+
Expression: with_columns(alias(log(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 0 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
| D | 3 |
+---------+--------+
| E | 4 |
+---------+--------+
```
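A logarithm in an arbitrary base can be expressed through the natural logarithm with the change-of-base identity `log_b(x) = ln(x) / ln(b)`. The sketch below uses that identity to reproduce the example above. The Python `log` helper is hypothetical and is not claimed to be how RustDFScript implements `log`.

```python
import math

def log(x, base):
    """Logarithm of x in the given base via the change-of-base identity."""
    return math.log(x) / math.log(base)

# Column `name` from the example above, with base 2.
name = [1.0, 2.0, 4.0, 8.0, 16.0]

# Rounded for display, because the division can introduce tiny floating-point error.
print([round(log(v, 2), 12) for v in name])  # [0.0, 1.0, 2.0, 3.0, 4.0]
```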
### 4.44 Square root
```rustdfscript
Function
sqrt
Syntax
sqrt(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 9 |
+---------+--------+
| D | 16 |
+---------+--------+
| E | 25 |
+---------+--------+
Expression: with_columns(alias(sqrt(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
### 4.45 Cube root
```rustdfscript
Function
cbrt
Syntax
cbrt(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 8 |
+---------+--------+
| C | -27 |
+---------+--------+
| D | 64 |
+---------+--------+
| E | -125 |
+---------+--------+
Expression: with_columns(alias(cbrt(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
```
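The example above returns real cube roots for negative inputs, which is not what raising a negative number to the power `1/3` gives in most languages. The plain-Python sketch below shows the difference and one common way to get the real root: take the root of the magnitude and restore the sign. The `cbrt` helper is hypothetical and only models the documented result.

```python
import math

def cbrt(x):
    """Real cube root: root of the magnitude, with the sign of x restored."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

# Column `name` from the example above.
name = [-1.0, 8.0, -27.0, 64.0, -125.0]

# Rounded for display, because the fractional power is not exact in floating point.
print([round(cbrt(v), 12) for v in name])  # [-1.0, 2.0, -3.0, 4.0, -5.0]

# In Python, a negative base with a fractional exponent yields a complex number,
# not the real root -3.
print(isinstance((-27.0) ** (1.0 / 3.0), complex))  # True
```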
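### 4.46 Sine
```rustdfscript
Function
sin
Syntax
sin(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.47 Cosine
```rustdfscript
Function
cos
Syntax
cos(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.48 Tangent
```rustdfscript
Function
tan
Syntax
tan(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name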
```
### 4.49 Cotangent
```rustdfscript
Function
cot
Syntax
cot(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
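The trigonometric sections above have no worked examples, so here is a small numeric check of the identity `cot(x) = cos(x) / sin(x)`. It assumes the angle is given in radians, which this manual does not state explicitly, and the Python `cot` helper is a hypothetical stand-in for the RustDFScript function.

```python
import math

def cot(x):
    """Cotangent of x (x in radians), defined as cos(x) / sin(x)."""
    return math.cos(x) / math.sin(x)

# Rounded for display; pi/4 is not exactly representable as a double.
print(round(cot(math.pi / 4), 12))  # 1.0

# cot is the reciprocal of tan wherever both are defined.
print(math.isclose(cot(1.0), 1.0 / math.tan(1.0)))  # True

# Near pi/2 the cotangent approaches zero.
print(abs(cot(math.pi / 2)) < 1e-15)  # True
```

At multiples of pi the sine is zero and the cotangent is undefined. This Python model raises `ZeroDivisionError` at exactly `0.0`; how RustDFScript reports that case is not covered here.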
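### 4.50 Arcsine
```rustdfscript
Function
arcsin
Syntax
arcsin(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.51 Arccosine
```rustdfscript
Function
arccos
Syntax
arccos(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.52 Arctangent
```rustdfscript
Function
arctan
Syntax
arctan(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name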
```
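### 4.53 Hyperbolic sine
```rustdfscript
Function
sinh
Syntax
sinh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.54 Hyperbolic cosine
```rustdfscript
Function
cosh
Syntax
cosh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.55 Hyperbolic tangent
```rustdfscript
Function
tanh
Syntax
tanh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name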
```
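### 4.56 Inverse hyperbolic sine
```rustdfscript
Function
arcsinh
Syntax
arcsinh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.57 Inverse hyperbolic cosine
```rustdfscript
Function
arccosh
Syntax
arccosh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.58 Inverse hyperbolic tangent
```rustdfscript
Function
arctanh
Syntax
arctanh(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name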
```
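### 4.59 Convert radians to degrees
```rustdfscript
Function
degrees
Syntax
degrees(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.60 Convert degrees to radians
```rustdfscript
Function
radians
Syntax
radians(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name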
```
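`degrees` and `radians` are inverse conversions: `degrees(x) = x * 180 / pi` and `radians(x) = x * pi / 180`. The sketch below checks this with the functions of the same name in Python's `math` module. It illustrates the conversion only and is not RustDFScript code.

```python
import math

# Rounded for display, since the conversion factor involves pi.
print(round(math.degrees(math.pi), 9))      # 180.0
print(round(math.degrees(math.pi / 2), 9))  # 90.0
print(math.isclose(math.radians(180.0), math.pi))  # True

# Converting back and forth returns the original angle up to rounding.
angle_deg = 37.5
round_trip = math.degrees(math.radians(angle_deg))
print(math.isclose(round_trip, angle_deg))  # True
```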
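### 4.61 Floor (round down)
```rustdfscript
Function
floor
Syntax
floor(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.62 Ceiling (round up)
```rustdfscript
Function
ceil
Syntax
ceil(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.63 Round to nearest
```rustdfscript
Function
round
Syntax
round(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.64 Sign
```rustdfscript
Function
sign
Syntax
sign(col(col_name))
Input parameters
col - function that references a DataFrame column
col_name - column name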
```
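Since `floor`, `ceil` and `sign` have no worked examples above, the sketch below shows their usual mathematical behavior on a few values, including negatives, where floor and ceiling are most often confused. It assumes the conventional definitions (`sign` returning -1, 0 or 1). The manual does not spell these out, and the `sign` helper is a hypothetical Python stand-in.

```python
import math

def sign(x):
    """Conventional sign function: -1 for negatives, 0 for zero, 1 for positives."""
    return (x > 0) - (x < 0)

values = [-2.5, -0.4, 0.0, 0.4, 2.5]

# floor moves toward negative infinity, ceil toward positive infinity.
print([math.floor(v) for v in values])  # [-3, -1, 0, 0, 2]
print([math.ceil(v) for v in values])   # [-2, 0, 0, 1, 3]
print([sign(v) for v in values])        # [-1, -1, 0, 1, 1]
```

Python returns integers here, whereas a DataFrame column would normally keep its numeric type; only the values are meant to be compared.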
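### 4.65 Dot product
```rustdfscript
Function
dot
Syntax
dot(col(col_name1), col(col_name2))
Input parameters
col - function that references a DataFrame column
col_name1 - name of the first column
col_name2 - name of the second column
Example
Input input_df: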
+---------+--------+
| t1 | t2 |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 2 |
+---------+--------+
| 2 | 3 |
+---------+--------+
| 3 | 4 |
+---------+--------+
| 4 | 5 |
+---------+--------+
| 5 | 6 |
+---------+--------+
Expression: select(alias(dot(col(t1), col(t2)), t1), input_df)
Result
+---------+
| t1 |
| --- |
| f64 |
+=========+
| 70 |
+---------+
```
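The dot product multiplies the two columns row by row and sums the products. The following plain-Python sketch reproduces the result above; the `dot` helper is hypothetical and only models the documented arithmetic.

```python
def dot(xs, ys):
    """Sum of element-wise products of two equally long columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    return sum(x * y for x, y in zip(xs, ys))

# Columns t1 and t2 from the example above.
t1 = [1.0, 2.0, 3.0, 4.0, 5.0]
t2 = [2.0, 3.0, 4.0, 5.0, 6.0]

# 1*2 + 2*3 + 3*4 + 4*5 + 5*6 = 2 + 6 + 12 + 20 + 30
print(dot(t1, t2))  # 70.0
```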