Commit 8eaf827e by xuchengsi

TensorEval node polars operations documentation

parent 12daaf1c
# RustDFScript Language Specification
## 1. Language Overview
RustDFScript is a language for operating on DataFrames.
## 2. Lexical Rules
### 2.1 Identifiers
```rustdfscript
// Valid identifiers
variable_name
matrix1
bus_data
PQ_bus
```
### 2.2 Numeric Literals
```rustdfscript
// Integers
42
-17
0
// Floating-point numbers
3.14
-2.718
1.23e-4
6.022e23
// Scientific notation
1.5e10
-2.3E-5
```
### 2.3 String Literals
```rustdfscript
// Strings use double quotes
"Hello, World!"
"Power flow data for IEEE 14 bus"
"File path: /data/case14.txt"
```
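### 2.4 Mathematical Constant Conventions
- π: pi
- Euler's number: e
- Not a number (e.g. the result of 0/0): NAN
- Positive infinity: INF
- Negative infinity: NEG_INF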
### 2.5 Comments
```rustdfscript
// Single-line comment
/* Multi-line comment */
/*
* Block comment
* spanning multiple lines
*/
```
### 2.6 Data Types
```rustdfscript
null
i8
i16
i32
i64
i128
u8
u16
u32
u64
u128
f32
f64
bool
binary
str
date
time
```
### 2.7 Creating and Assigning Variables
```rustdfscript
len = height(input1);
output = with_columns(alias(cast(arange(0, len) + 1, u32), index), input1);
```
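The two statements above appear to append a 1-based row-number column named `index`. The following is a minimal Python sketch of that reading, not the RustDFScript implementation: the DataFrame is modelled as a dict of equal-length lists, `with_row_index` is a hypothetical helper name, and `arange(0, len)` is assumed to be half-open like Python's `range`.

```python
def with_row_index(df, name="index"):
    """Return a copy of df with a 1-based row-number column appended."""
    # Plays the role of height(input1); an empty frame has length 0.
    length = len(next(iter(df.values()), []))
    out = dict(df)
    # Mirrors arange(0, len) + 1, assuming a half-open range.
    out[name] = [i + 1 for i in range(0, length)]
    return out


input1 = {"Element": ["Copper", "Silver", "Gold"], "Proton": [29, 47, 79]}
output = with_row_index(input1)
print(output["index"])
```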
## 3. Functions That Operate on a Whole DataFrame
### 3.1 Functions That Do Not Return a DataFrame
#### 3.1.1 Get the Number of Rows of a DataFrame
```rustdfscript
Function
height
Syntax
height(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+----------+
| Element | Proton | Electron |
| --- | --- | --- |
| str | i32 | i32 |
+=========+========+==========+
| Copper | 29 | 29 |
+---------+--------+----------+
| Silver | 47 | 47 |
+---------+--------+----------+
| Gold | 79 | 79 |
+---------+--------+----------+
Result: 3
```
#### 3.1.2 Get the Number of Columns of a DataFrame
```rustdfscript
Function
width
Syntax
width(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+
| Element | Proton |
| --- | --- |
| str | i32 |
+=========+========+
| Copper | 29 |
+---------+--------+
| Silver | 47 |
+---------+--------+
| Gold | 79 |
+---------+--------+
Result: 2
```
#### 3.1.3 Get the Total Number of Elements of a DataFrame
```rustdfscript
Function
size
Syntax
size(input_df);
Parameters
input_df - a DataFrame
Example
Input input_df:
+---------+--------+
| Element | Proton |
| --- | --- |
| str | i32 |
+=========+========+
| Copper | 29 |
+---------+--------+
| Silver | 47 |
+---------+--------+
| Gold | 79 |
+---------+--------+
Result: 6
```
**Note: a function that does not return a DataFrame cannot be nested inside a function that returns a DataFrame.**
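The relationship between the three scalar functions can be sketched in a few lines of Python. This is an illustration only: the DataFrame is modelled as a dict of equal-length lists, and `size` is assumed to equal rows times columns, which matches the examples (3 × 2 = 6).

```python
def height(df):
    """Number of rows: the length of any column (0 for an empty frame)."""
    return len(next(iter(df.values()), []))


def width(df):
    """Number of columns."""
    return len(df)


def size(df):
    """Total number of elements, assumed to be rows times columns."""
    return height(df) * width(df)


input_df = {"Element": ["Copper", "Silver", "Gold"], "Proton": [29, 47, 79]}
print(height(input_df), width(input_df), size(input_df))
```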
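### 3.2 Functions That Operate on a Single DataFrame
#### 3.2.1 Select Columns
```rustdfscript
Function
select
Syntax
select(col(col_name), input_df);
Parameters
col - function that references a DataFrame column
col_name - column name
input_df - the input DataFrame
Example
Input input_df: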
+---------+--------+----------+
| Element | Proton | Electron |
| --- | --- | --- |
| str | i32 | i32 |
+=========+========+==========+
| Copper | 29 | 29 |
+---------+--------+----------+
| Silver | 47 | 47 |
+---------+--------+----------+
| Gold | 79 | 79 |
+---------+--------+----------+
col_name: Element
Result
+---------+
| Element |
| --- |
| str |
+=========+
| Copper |
+---------+
| Silver |
+---------+
| Gold |
+---------+
```
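#### 3.2.2 Filter Rows by Condition
```rustdfscript
Function
filter
Syntax
filter(col(col_name), condition_expr, input_df);
Parameters
col - function that references a DataFrame column
col_name - column name
condition_expr - filter condition expression
input_df - the input DataFrame
Example
Input input_df: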
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 29 |
+---------+--------+
| B | 47 |
+---------+--------+
| C | 79 |
+---------+--------+
| D | 79 |
+---------+--------+
| E | 79 |
+---------+--------+
col_name: lit
condition_expr: col(value) > 2
Result
+---------+
| lit |
| --- |
| str |
+=========+
| A |
+---------+
| B |
+---------+
| C |
+---------+
| D |
+---------+
| E |
+---------+
```
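A rough Python sketch of the filtering semantics follows. It assumes, as the example output suggests, that the result contains only the column named by `col_name`, restricted to rows where the condition holds. The dict-of-lists frame, the `filter_rows` name, and the use of a Python predicate in place of `condition_expr` are all illustrative assumptions.

```python
def filter_rows(col_name, predicate, df):
    """Keep the values of col_name for rows where predicate(row) is true."""
    height = len(next(iter(df.values()), []))
    # Rebuild each row as a dict so the predicate can inspect any column.
    rows = [{name: values[i] for name, values in df.items()} for i in range(height)]
    return {col_name: [row[col_name] for row in rows if predicate(row)]}


sample_df = {"lit": ["A", "B", "C"], "value": [1.0, 5.0, 3.0]}
# Stands in for condition_expr: col(value) > 2
filtered = filter_rows("lit", lambda row: row["value"] > 2, sample_df)
print(filtered)
```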
#### 3.2.3 Add or Replace Columns
```rustdfscript
Function
with_columns
Syntax
with_columns(expr, input_df);
Parameters
expr - expression
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr: with_columns(replace(col(name), James, Jordan), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | Jordan |
+---------+--------+
| C | Curry |
+---------+--------+
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr: with_columns(alias(replace(col(name), James, Jordan), new_name), input_df)
Result
+---------+--------+----------+
| lit | name | new_name |
| --- | --- | --- |
| str | str | str |
+=========+========+==========+
| A | Kobe | Kobe |
+---------+--------+----------+
| B | James | Jordan |
+---------+--------+----------+
| C | Curry | Curry |
+---------+--------+----------+
```
#### 3.2.4 Group Data
```rustdfscript
Function
group_by
Syntax
group_by(col(col_name), input_df);
Parameters
col - function that references a DataFrame column
col_name - name of the column to group by
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| A | 2 |
+---------+--------+
| B | 3 |
+---------+--------+
| B | 4 |
+---------+--------+
col_name: lit
condition_expr: cum_sum(value)
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | list[f64] |
+=========+========+
| A | [1,2] |
+---------+--------+
| B | [3,4] |
+---------+--------+
```
#### 3.2.5 Drop Rows Containing Null Values
```rustdfscript
Function
drop_nulls
Syntax
drop_nulls(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.6 Drop Rows Containing NaN Values
```rustdfscript
Function
drop_nans
Syntax
drop_nans(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.7 Fill Null Values
```rustdfscript
Function
fill_null
Syntax
fill_null(fill_value, input_df);
Parameters
fill_value - fill value
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
fill_value: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
```
#### 3.2.8 Fill NaN Values
```rustdfscript
Function
fill_nan
Syntax
fill_nan(fill_value, input_df);
Parameters
fill_value - fill value
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
fill_value: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
```
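Null (a missing value) and NaN (a floating-point value that is not a number) are handled by separate functions, so filling one leaves the other untouched. The Python sketch below illustrates the distinction on a single column, modelling null as `None` and NaN as `float("nan")`. The function names mirror the ones above, but the bodies are illustrative assumptions rather than the actual implementation.

```python
import math


def fill_null(fill_value, column):
    """Replace missing values (None) and leave NaN untouched."""
    return [fill_value if v is None else v for v in column]


def fill_nan(fill_value, column):
    """Replace floating-point NaN and leave missing values (None) untouched."""
    return [fill_value if isinstance(v, float) and math.isnan(v) else v
            for v in column]


values = [1.0, None, float("nan"), 3.0]
nulls_filled = fill_null(2.0, values)  # the NaN at position 2 survives
nans_filled = fill_nan(2.0, values)    # the None at position 1 survives
print(nulls_filled, nans_filled)
```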
#### 3.2.9 Count Non-Null Values in Each Column
```rustdfscript
Function
count
Syntax
count(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| i64 | i64 |
+=========+========+
| 3 | 2 |
+---------+--------+
```
#### 3.2.10 Count Null Values in Each Column
```rustdfscript
Function
null_count
Syntax
null_count(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | None |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| i64 | i64 |
+=========+========+
| 0 | 1 |
+---------+--------+
```
#### 3.2.11 Get the First Row
```rustdfscript
Function
first
Syntax
first(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
```
#### 3.2.12 Get the Last Row
```rustdfscript
Function
last
Syntax
last(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| C | 3 |
+---------+--------+
```
#### 3.2.13 Reverse a DataFrame
```rustdfscript
Function
reverse
Syntax
reverse(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| C | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| A | 1 |
+---------+--------+
```
#### 3.2.14 Cast Data Types
```rustdfscript
Function
cast_all
Syntax
cast_all(dtype, input_df);
Parameters
dtype - data type
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
dtype: str
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | str |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
```
#### 3.2.15 Limit a DataFrame to the First n Rows
```rustdfscript
Function
limit
Syntax
limit(n, input_df);
Parameters
n - number of rows
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
#### 3.2.16 Limit a DataFrame to the Last n Rows
```rustdfscript
Function
tail
Syntax
tail(n, input_df);
Parameters
n - number of rows
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 2
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
```
#### 3.2.17 Sum Each Column
```rustdfscript
Function
sum
Syntax
sum(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 6 | 15 |
+---------+--------+
```
#### 3.2.18 Maximum of Each Column
```rustdfscript
Function
max
Syntax
max(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 3 | 6 |
+---------+--------+
```
#### 3.2.19 Minimum of Each Column
```rustdfscript
Function
min
Syntax
min(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
```
#### 3.2.20 Mean of Each Column
```rustdfscript
Function
mean
Syntax
mean(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
```
#### 3.2.21 Median of Each Column
```rustdfscript
Function
median
Syntax
median(input_df);
Parameters
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 5 |
+---------+--------+
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 2 | 5 |
+---------+--------+
```
#### 3.2.22 Shift
```rustdfscript
Function
shift
Syntax
shift(n, input_df);
Parameters
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| None | None |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
#### 3.2.23 Shift (Filling Vacated Positions with None)
```rustdfscript
Function
shift
Syntax
shift(n, input_df);
Parameters
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| None | None |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
#### 3.2.24 Shift (Filling Vacated Positions with a Given Value)
```rustdfscript
Function
shift_and_fill
Syntax
shift_and_fill(fill_value, n, input_df);
Parameters
fill_value - fill value
n - number of positions to shift
input_df - the input DataFrame
Example
Input input_df:
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
| 3 | 6 |
+---------+--------+
n: 1
fill_value: -1
Result
+---------+--------+
| lit | value |
| --- | --- |
| f64 | f64 |
+=========+========+
| -1 | -1 |
+---------+--------+
| 1 | 4 |
+---------+--------+
| 2 | 5 |
+---------+--------+
```
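The two shift variants differ only in what is written into the vacated positions. Below is a minimal Python sketch for a single column, with `shift` expressed as `shift_and_fill` with a `None` fill. Treating a negative `n` as a shift toward the start of the column is an assumption; the examples above only cover `n = 1`.

```python
def shift_and_fill(fill_value, n, column):
    """Shift values by n positions and write fill_value into the vacated slots."""
    length = len(column)
    if n >= 0:
        k = min(n, length)
        # Values move toward the end; the first k slots are vacated.
        return [fill_value] * k + column[: length - k]
    k = min(-n, length)
    # Assumed behaviour for negative n: values move toward the start.
    return column[k:] + [fill_value] * k


def shift(n, column):
    """Shift with None in the vacated slots."""
    return shift_and_fill(None, n, column)


lit = [1.0, 2.0, 3.0]
value = [4.0, 5.0, 6.0]
print(shift(1, lit), shift_and_fill(-1, 1, value))
```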
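### 3.3 Functions That Operate on Multiple DataFrames
#### 3.3.1 Join DataFrames
```rustdfscript
Function
join
Syntax
join(df1, df2, left_on, right_on, how);
Parameters
df1 - the first DataFrame
df2 - the second DataFrame
left_on - join key of the left table
right_on - join key of the right table
how - join type: inner, full, left, or right
Example
Input df1: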
+---------+--------+
| lit | name1 |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
Input df2:
+---------+--------+
| name2 | value |
| --- | --- |
| str | f64 |
+=========+========+
| Kobe | 1 |
+---------+--------+
| Jordan | 2 |
+---------+--------+
| Curry | 3 |
+---------+--------+
left_on: name1
right_on: name2
how: inner
Result
+---------+--------+--------+
| lit | name1 | value |
| --- | --- | --- |
| str | str | f64 |
+=========+========+========+
| A | Kobe | 1 |
+---------+--------+--------+
| C | Curry | 3 |
+---------+--------+--------+
```
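The `inner` case can be sketched in Python as a hash join. This is an illustration only, not the underlying implementation: each DataFrame is modelled as a list of row dicts, `inner_join` is a hypothetical helper, and the right-hand key column is assumed to be dropped from the output, as in the example result.

```python
def inner_join(df1, df2, left_on, right_on):
    """Inner-join two lists of row dicts on df1[left_on] == df2[right_on]."""
    # Index the right table by its key so each left row is matched in O(1).
    index = {}
    for row in df2:
        index.setdefault(row[right_on], []).append(row)
    out = []
    for left in df1:
        for right in index.get(left[left_on], []):
            merged = dict(left)
            # Assumed: the right key column is not repeated in the output.
            merged.update({k: v for k, v in right.items() if k != right_on})
            out.append(merged)
    return out


df1 = [{"lit": "A", "name1": "Kobe"},
       {"lit": "B", "name1": "James"},
       {"lit": "C", "name1": "Curry"}]
df2 = [{"name2": "Kobe", "value": 1.0},
       {"name2": "Jordan", "value": 2.0},
       {"name2": "Curry", "value": 3.0}]
joined = inner_join(df1, df2, "name1", "name2")
print(joined)
```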
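#### 3.3.2 Sort
```rustdfscript
Function
sort
Syntax
sort(by, descending, maintain_order, input_df);
Parameters
by - column names to sort by (list)
descending - sort direction (list); true sorts in descending order, false in ascending order
maintain_order - whether to keep the original order of equal elements; true keeps it, false does not
input_df - the input DataFrame
Example
Input input_df: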
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| A | 2 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 5 |
+---------+--------+
by:[value]
descending:[true]
maintain_order:true
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| E | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| A | 2 |
+---------+--------+
| D | 1 |
+---------+--------+
```
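A multi-key sort with per-key direction can be sketched in Python using repeated stable sorts. Python's `list.sort` is stable even with `reverse=True`, so this corresponds to `maintain_order: true`. The list-of-row-dicts frame and the `sort_rows` name are illustrative assumptions.

```python
def sort_rows(by, descending, rows):
    """Sort row dicts by several keys, each with its own direction."""
    out = list(rows)
    # Sort by the last key first; because every pass is stable,
    # earlier keys in `by` end up taking precedence.
    for key, desc in reversed(list(zip(by, descending))):
        out.sort(key=lambda row: row[key], reverse=desc)
    return out


rows = [{"lit": "A", "value": 2.0}, {"lit": "B", "value": 4.0},
        {"lit": "C", "value": 3.0}, {"lit": "D", "value": 1.0},
        {"lit": "E", "value": 5.0}]
ordered = sort_rows(["value"], [True], rows)
print([row["lit"] for row in ordered])
```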
#### 3.3.3 Concatenate DataFrames
```rustdfscript
Function
concat
Syntax
concat(how, df1, df2);
Parameters
df1 - the first DataFrame
df2 - the second DataFrame
how - concatenation mode, e.g. horizontal or vertical
Example
Input df1:
+---------+--------+
| lit | name1 |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
Input df2:
+---------+--------+
| name2 | value |
| --- | --- |
| str | f64 |
+=========+========+
| Kobe | 1 |
+---------+--------+
| Jordan | 2 |
+---------+--------+
| Curry | 3 |
+---------+--------+
how: horizontal
Result
+---------+--------+--------+--------+
| lit | name1 | name2 | value |
| --- | --- | --- | --- |
| str | str | str | f64 |
+=========+========+=========+========+
| A | Kobe | Kobe | 1 |
+---------+--------+--------+--------+
| B | James | Jordan | 2 |
+---------+--------+--------+--------+
| C | Curry | Curry | 3 |
+---------+--------+--------+--------+
```
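#### 3.3.4 Pivot (Convert Long-Format Data to Wide Format)
```rustdfscript
Function
pivot
Syntax
pivot(on, index, values, sort_columns, agg_expr, sep, input_df);
Parameters
on - column(s) to pivot on (list)
index - row index column(s) (list)
values - value column(s) (list)
sort_columns - whether to sort the generated column names alphabetically (boolean)
agg_expr - aggregation expression
sep - separator used in generated column names
input_df - the input DataFrame
Example
Input input_df: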
+---------+--------+--------+
| lit | year | value |
| --- | --- | --- |
| str | i64 | i64 |
+=========+========+========+
| A | 2021 | 1 |
+---------+--------+--------+
| A | 2022 | 2 |
+---------+--------+--------+
| A | 2023 | 3 |
+---------+--------+--------+
| B | 2021 | 4 |
+---------+--------+--------+
| B | 2022 | 5 |
+---------+--------+--------+
| B | 2023 | 6 |
+---------+--------+--------+
| B | 2023 | 7 |
+---------+--------+--------+
on:[year]
index:[lit]
values:[value]
sort_columns:false
agg_expr:
sep:_
Result
+---------+--------+
| lit | value |
| --- | --- |
| str | f64 |
+=========+========+
| E | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 3 |
+---------+--------+
| A | 2 |
+---------+--------+
| D | 1 |
+---------+--------+
```
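## 4. Functions That Operate on DataFrame Columns
### 4.1 Rename a Column
```rustdfscript
Function
alias
Syntax
alias(col(col_name), new_col_name);
Parameters
col - function that references a DataFrame column
col_name - column name
new_col_name - new column name
Example
Input input_df: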
+---------+--------+
| lit | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
new_col_name: new_name
expr: with_columns(alias(replace(col(name), James, Jordan), new_name), input_df)
Result
+---------+--------+----------+
| lit | name | new_name |
| --- | --- | --- |
| str | str | str |
+=========+========+==========+
| A | Kobe | Kobe |
+---------+--------+----------+
| B | James | Jordan |
+---------+--------+----------+
| C | Curry | Curry |
+---------+--------+----------+
```
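### 4.2 Conditional Logic
```rustdfscript
Function
when_then_otherwise
Syntax
when_then_otherwise(condition, value_if_true, value_if_false);
Parameters
condition - condition expression
value_if_true - value assigned when the condition holds
value_if_false - value assigned when the condition does not hold
Example
Input input_df: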
+---------+--------+
| columns | name |
| --- | --- |
| str | str |
+=========+========+
| A | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
expr:with_columns(alias(when_then_otherwise(arange(0, 3) == 0, alias(e0,columns), col(columns)), columns), input_df)
Result
+---------+--------+
| columns | name |
| --- | --- |
| str | str |
+=========+========+
| e0 | Kobe |
+---------+--------+
| B | James |
+---------+--------+
| C | Curry |
+---------+--------+
```
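The element-wise selection that `when_then_otherwise` performs can be sketched in Python as follows. The sketch works on plain lists of equal length and broadcasts the literal `e0` by hand; how RustDFScript broadcasts scalars is not specified above, so that part is an assumption.

```python
def when_then_otherwise(condition, value_if_true, value_if_false):
    """Pick value_if_true[i] where condition[i] holds, else value_if_false[i]."""
    return [t if c else f
            for c, t, f in zip(condition, value_if_true, value_if_false)]


columns = ["A", "B", "C"]
# Mirrors arange(0, 3) == 0: true only for the first row.
condition = [i == 0 for i in range(0, 3)]
# The literal e0 is repeated to column length (assumed broadcasting).
replaced = when_then_otherwise(condition, ["e0"] * len(columns), columns)
print(replaced)
```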
### 4.3 Logical Negation
```rustdfscript
Function
not
Syntax
not(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
Expression: with_columns(alias(not(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.4 Test for Non-Null
```rustdfscript
Function
not_null
Syntax
not_null(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(not_null(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.5 Test for Null
```rustdfscript
Function
is_null
Syntax
is_null(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(is_null(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.6 Test for Non-NaN
```rustdfscript
Function
not_nan
Syntax
not_nan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(not_nan(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.7 Test for NaN
```rustdfscript
Function
is_nan
Syntax
is_nan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(is_nan(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
### 4.8 Drop Null Values
```rustdfscript
Function
drop_nulls
Syntax
drop_nulls(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | null |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(drop_nulls(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
### 4.9 Drop NaN Values
```rustdfscript
Function
drop_nans
Syntax
drop_nans(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | NaN |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(drop_nans(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
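### 4.10 Test Equality with a Given Value
```rustdfscript
Function
eq
Syntax
eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: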
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
Expression: with_columns(alias(eq(col(name), 1), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
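### 4.11 Test Inequality with a Given Value
```rustdfscript
Function
neq
Syntax
neq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: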
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
Expression: with_columns(alias(neq(col(name), 1), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
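### 4.12 Test Less Than a Given Value
```rustdfscript
Function
lt
Syntax
lt(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: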
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(lt(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | false |
+---------+--------+
| C | false |
+---------+--------+
```
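### 4.13 Test Greater Than a Given Value
```rustdfscript
Function
gt
Syntax
gt(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: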
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(gt(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | false |
+---------+--------+
| C | true |
+---------+--------+
```
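### 4.14 Test Less Than or Equal to a Given Value
```rustdfscript
Function
lt_eq
Syntax
lt_eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: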
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(lt_eq(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | true |
+---------+--------+
| B | true |
+---------+--------+
| C | false |
+---------+--------+
```
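### 4.15 Test Greater Than or Equal to a Given Value
```rustdfscript
Function
gt_eq
Syntax
gt_eq(col(col_name), value)
Parameters
col - function that references a DataFrame column
col_name - column name
value - value to compare against
Example
Input input_df: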
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(gt_eq(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | bool |
+=========+========+
| A | false |
+---------+--------+
| B | true |
+---------+--------+
| C | true |
+---------+--------+
```
### 4.16 Count Unique Values
```rustdfscript
Function
n_unique
Syntax
n_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(n_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 2 |
+---------+
```
### 4.17 Get Unique Values
```rustdfscript
Function
unique
Syntax
unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: select(alias(unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
| 3 |
+---------+
```
### 4.18 Get Index Positions of Unique Values
```rustdfscript
Function
arg_unique
Syntax
arg_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 0 |
+---------+
| 2 |
+---------+
| 3 |
+---------+
```
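As the example shows, `arg_unique` reports the 0-based position of the first occurrence of each distinct value. A short Python sketch of that behaviour on a plain list (an illustration, not the underlying implementation):

```python
def arg_unique(column):
    """Return the 0-based index of the first occurrence of each distinct value."""
    seen = set()
    positions = []
    for i, v in enumerate(column):
        if v not in seen:
            seen.add(v)
            positions.append(i)
    return positions


name = [1.0, 1.0, 3.0, 2.0, 3.0]
first_positions = arg_unique(name)
print(first_positions)
```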
### 4.19 Get the Index of the First Minimum
```rustdfscript
Function
arg_min
Syntax
arg_min(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_min(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 0 |
+---------+
```
### 4.20 Get the Index of the First Maximum
```rustdfscript
Function
arg_max
Syntax
arg_max(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(arg_max(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 2 |
+---------+
```
### 4.21 Test Whether Values Are Unique
```rustdfscript
Function
is_unique
Syntax
is_unique(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(is_unique(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| bool |
+========+
| false |
+---------+
| true |
+---------+
| false |
+---------+
| false |
+---------+
| false |
+---------+
```
### 4.22 Test Whether Values Are Duplicated
```rustdfscript
Function
is_duplicated
Syntax
is_duplicated(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 1 |
+---------+--------+
| E | 3 |
+---------+--------+
Expression: select(alias(is_duplicated(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| bool |
+========+
| true |
+---------+
| false |
+---------+
| true |
+---------+
| true |
+---------+
| true |
+---------+
```
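`is_unique` and `is_duplicated` are complements: a value is unique when it occurs exactly once in the column and duplicated when it occurs more than once. A small Python sketch of this reading, using plain lists and the standard-library `Counter` (an illustration, not the underlying implementation):

```python
from collections import Counter


def is_unique(column):
    """True for values that occur exactly once in the column."""
    counts = Counter(column)
    return [counts[v] == 1 for v in column]


def is_duplicated(column):
    """True for values that occur more than once in the column."""
    counts = Counter(column)
    return [counts[v] > 1 for v in column]


name = [1.0, 2.0, 3.0, 1.0, 3.0]
unique_mask = is_unique(name)
duplicated_mask = is_duplicated(name)
print(unique_mask, duplicated_mask)
```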
### 4.23 Reverse Data
```rustdfscript
Function
reverse
Syntax
reverse(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
Expression: with_columns(alias(reverse(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 1 |
+---------+--------+
```
### 4.24 Count Non-Null Values
```rustdfscript
Function
count
Syntax
count(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(count(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 4 |
+---------+
```
### 4.25 Count Null Values
```rustdfscript
Function
null_count
Syntax
null_count(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(null_count(col(name)), new_name), input_df)
Result
+---------+
| new_name |
| --- |
| f64 |
+========+
| 1 |
+---------+
```
### 4.26 Get the Length
```rustdfscript
Function
len
Syntax
len(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(len(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 5 |
+----------+
```
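The functions in 4.24 to 4.26 differ only in how they treat null cells. The plain-Python sketch below mirrors the example column used above. It illustrates the semantics only and is not the RustDFScript implementation; `None` is assumed here as a stand-in for a null cell.

```python
# Example column from 4.24-4.26; None stands in for a null cell (assumption).
name = [1.0, 2.0, 3.0, None, 5.0]

count = sum(1 for v in name if v is not None)   # non-null values only
null_count = sum(1 for v in name if v is None)  # null values only
length = len(name)                              # every row, null or not

print(count, null_count, length)  # 4 1 5
```

For any column, `count + null_count` equals `len`.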
### 4.27 Sum
```rustdfscript
Function
sum
Syntax
sum(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(sum(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 15 |
+----------+
```
### 4.28 Minimum
```rustdfscript
Function
min
Syntax
min(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(min(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 1 |
+----------+
```
### 4.29 Maximum
```rustdfscript
Function
max
Syntax
max(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(max(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 5 |
+----------+
```
### 4.30 Mean
```rustdfscript
Function
mean
Syntax
mean(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(mean(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 3 |
+----------+
```
### 4.31 Median
```rustdfscript
Function
median
Syntax
median(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3.5 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(median(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 3.5 |
+----------+
```
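Each function in 4.27 to 4.31 reduces a whole column to one value, which is why the examples use `select` and return a single row. As a rough cross-check of the arithmetic, the sketch below recomputes the aggregates in plain Python for the column from the median example. It is only an illustration of the expected values, not how RustDFScript evaluates them.

```python
import statistics

# Column from the median example in 4.31.
name = [1.0, 2.0, 3.5, 4.0, 5.0]

total = sum(name)                 # 15.5
smallest = min(name)              # 1.0
largest = max(name)               # 5.0
mean = statistics.fmean(name)     # about 3.1
median = statistics.median(name)  # 3.5 (middle of the sorted values)
```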
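### 4.32 Standard deviation
```rustdfscript
Function
std
Syntax
std(col(col_name), ddof)
Parameters
col - function that references a DataFrame column
col_name - column name
ddof - delta degrees of freedom (the divisor is N - ddof): 0 gives the population standard deviation, 1 gives the sample standard deviation
Example
Input input_df: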
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -1 |
+---------+--------+
| D | 5 |
+---------+--------+
| E | 4 |
+---------+--------+
Expression: select(alias(std(col(name), 1), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| sqrt(5.3) ≈ 2.3022 |
+----------+
```
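### 4.33 Variance
```rustdfscript
Function
var
Syntax
var(col(col_name), ddof)
Parameters
col - function that references a DataFrame column
col_name - column name
ddof - delta degrees of freedom (the divisor is N - ddof): 0 gives the population variance, 1 gives the sample variance
Example
Input input_df: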
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 3 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -1 |
+---------+--------+
| D | 5 |
+---------+--------+
| E | 4 |
+---------+--------+
Expression: select(alias(var(col(name), 1), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 5.3 |
+----------+
```
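The `ddof` argument of `std` and `var` is easiest to see by recomputing the example by hand. The sketch below assumes the usual definition, in which the sum of squared deviations is divided by `N - ddof`; that definition reproduces the 5.3 shown above, but the helper functions here are illustrative and are not the RustDFScript implementation.

```python
import math

def var(values, ddof):
    """Variance with divisor N - ddof (0: population, 1: sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return squared_dev / (n - ddof)

def std(values, ddof):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(var(values, ddof))

name = [3.0, 2.0, -1.0, 5.0, 4.0]  # column from 4.32 and 4.33
sample_var = var(name, 1)          # about 5.3  (21.2 / 4)
population_var = var(name, 0)      # about 4.24 (21.2 / 5)
sample_std = std(name, 1)          # about 2.3022
```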
### 4.34 Cumulative count
```rustdfscript
Function
cum_count
Syntax
cum_count(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | null |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_count(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 1 |
+----------+
| 2 |
+----------+
| 3 |
+----------+
| 3 |
+----------+
| 4 |
+----------+
```
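The result above shows that `cum_count` counts non-null values only: the null row repeats the previous count instead of incrementing it. A minimal plain-Python sketch of that behaviour, with `None` assumed as the stand-in for null:

```python
# Column from the cum_count example; None stands in for null (assumption).
name = [1.0, 2.0, 3.0, None, 5.0]

cum_count = []
seen = 0
for value in name:
    if value is not None:  # null rows do not advance the counter
        seen += 1
    cum_count.append(seen)

print(cum_count)  # [1, 2, 3, 3, 4]
```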
### 4.35 Cumulative sum
```rustdfscript
Function
cum_sum
Syntax
cum_sum(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_sum(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 1 |
+----------+
| 3 |
+----------+
| 6 |
+----------+
| 10 |
+----------+
| 15 |
+----------+
```
### 4.36 Cumulative minimum
```rustdfscript
Function
cum_min
Syntax
cum_min(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 6 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 7 |
+---------+--------+
Expression: select(alias(cum_min(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 5 |
+----------+
| 4 |
+----------+
| 4 |
+----------+
| 2 |
+----------+
| 2 |
+----------+
```
### 4.37 Cumulative maximum
```rustdfscript
Function
cum_max
Syntax
cum_max(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 5 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 6 |
+---------+--------+
| D | 2 |
+---------+--------+
| E | 7 |
+---------+--------+
Expression: select(alias(cum_max(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 5 |
+----------+
| 5 |
+----------+
| 6 |
+----------+
| 6 |
+----------+
| 7 |
+----------+
```
### 4.38 Cumulative product
```rustdfscript
Function
cum_prod
Syntax
cum_prod(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: select(alias(cum_prod(col(name)), new_name), input_df)
Result
+----------+
| new_name |
| --- |
| f64 |
+==========+
| 1 |
+----------+
| 2 |
+----------+
| 6 |
+----------+
| 24 |
+----------+
| 120 |
+----------+
```
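The cumulative functions in 4.35 to 4.38 all follow one pattern: row i of the output is the running result of a binary operation over rows 0 through i. The sketch below expresses that with `itertools.accumulate` from the Python standard library. It is a model of the semantics under that assumption, not the RustDFScript implementation, and it does not cover null handling.

```python
from itertools import accumulate
import operator

# Column from the cum_min / cum_max examples.
values = [5.0, 4.0, 6.0, 2.0, 7.0]

cum_sum = list(accumulate(values))                 # running addition
cum_min = list(accumulate(values, min))            # smallest value so far
cum_max = list(accumulate(values, max))            # largest value so far
cum_prod = list(accumulate(values, operator.mul))  # running multiplication

print(cum_min)  # [5.0, 4.0, 4.0, 2.0, 2.0]
print(cum_max)  # [5.0, 5.0, 6.0, 6.0, 7.0]
```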
### 4.39 Absolute value
```rustdfscript
Function
abs
Syntax
abs(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
Expression: with_columns(alias(abs(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
### 4.40 Power
```rustdfscript
Function
pow
Syntax
pow(col(col_name), exponent)
Parameters
col - function that references a DataFrame column
col_name - column name
exponent - the exponent
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
Expression: with_columns(alias(pow(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 9 |
+---------+--------+
| D | 16 |
+---------+--------+
| E | 25 |
+---------+--------+
```
### 4.41 Natural exponential
```rustdfscript
Function
exp
Syntax
exp(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
Expression: with_columns(alias(exp(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | e^1 |
+---------+--------+
| B | e^2 |
+---------+--------+
| C | e^3 |
+---------+--------+
| D | e^4 |
+---------+--------+
| E | e^5 |
+---------+--------+
```
### 4.42 Compute ln(1+x)
```rustdfscript
Function
log1p
Syntax
log1p(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | e^1-1 |
+---------+--------+
| B | e^2-1 |
+---------+--------+
| C | e^3-1 |
+---------+--------+
| D | e^4-1 |
+---------+--------+
| E | e^5-1 |
+---------+--------+
Expression: with_columns(alias(log1p(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
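A dedicated `log1p` is usually provided because computing ln(1+x) as a plain logarithm of `1 + x` loses precision when x is very small. The sketch below shows the effect with Python's `math` module. It illustrates the general numerical motivation and makes no claim about how RustDFScript implements `log1p` internally.

```python
import math

x = 1e-10

naive = math.log(1 + x)   # 1 + x is rounded before the log is taken
accurate = math.log1p(x)  # evaluates ln(1 + x) without forming 1 + x

# For tiny x, ln(1 + x) is very close to x itself.
naive_rel_error = abs(naive - x) / x
accurate_rel_error = abs(accurate - x) / x

# Round trip matching the example above: log1p(e^2 - 1) is about 2.
round_trip = math.log1p(math.expm1(2.0))
```

In double precision `naive_rel_error` is on the order of 1e-7, while `accurate_rel_error` is on the order of 1e-10, which is just the true gap between ln(1+x) and x.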
### 4.43 Logarithm
```rustdfscript
Function
log
Syntax
log(col(col_name), base)
Parameters
col - function that references a DataFrame column
col_name - column name
base - base of the logarithm
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 4 |
+---------+--------+
| D | 8 |
+---------+--------+
| E | 16 |
+---------+--------+
Expression: with_columns(alias(log(col(name), 2), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 0 |
+---------+--------+
| B | 1 |
+---------+--------+
| C | 2 |
+---------+--------+
| D | 3 |
+---------+--------+
| E | 4 |
+---------+--------+
```
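The base-2 results above can be reproduced with the change-of-base identity log_b(x) = ln(x) / ln(b). The sketch below assumes `log(col, base)` follows that identity, which is consistent with the example but is not confirmed by this document.

```python
import math

def log_base(x, base):
    """Logarithm of x in the given base via the change-of-base identity."""
    return math.log(x) / math.log(base)

name = [1.0, 2.0, 4.0, 8.0, 16.0]  # column from the example above
result = [log_base(v, 2) for v in name]
# result is approximately [0, 1, 2, 3, 4]; floating-point results may
# differ from the exact integers in the last digit.
```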
### 4.44 Square root
```rustdfscript
Function
sqrt
Syntax
sqrt(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 4 |
+---------+--------+
| C | 9 |
+---------+--------+
| D | 16 |
+---------+--------+
| E | 25 |
+---------+--------+
Expression: with_columns(alias(sqrt(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | 1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | 3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | 5 |
+---------+--------+
```
### 4.45 Cube root
```rustdfscript
Function
cbrt
Syntax
cbrt(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
Example
Input input_df:
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 8 |
+---------+--------+
| C | -27 |
+---------+--------+
| D | 64 |
+---------+--------+
| E | -125 |
+---------+--------+
Expression: with_columns(alias(cbrt(col(name)), name), input_df)
Result
+---------+--------+
| lit | name |
| --- | --- |
| str | f64 |
+=========+========+
| A | -1 |
+---------+--------+
| B | 2 |
+---------+--------+
| C | -3 |
+---------+--------+
| D | 4 |
+---------+--------+
| E | -5 |
+---------+--------+
```
### 4.46 Sine
```rustdfscript
Function
sin
Syntax
sin(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.47 Cosine
```rustdfscript
Function
cos
Syntax
cos(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.48 Tangent
```rustdfscript
Function
tan
Syntax
tan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.49 Cotangent
```rustdfscript
Function
cot
Syntax
cot(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
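Sections 4.46 to 4.49 give no worked example, so the following plain-Python sketch shows the values these functions would be expected to produce element by element. It assumes the column holds angles in radians, which this document does not state, and it defines `cot` as cos/sin for illustration only.

```python
import math

def cot(x):
    """Cotangent as cos(x) / sin(x); undefined where sin(x) is 0."""
    return math.cos(x) / math.sin(x)

# Hypothetical column of angles in radians (assumed unit).
angles = [math.pi / 6, math.pi / 4, math.pi / 3]

sines = [math.sin(a) for a in angles]    # about 0.5, 0.7071, 0.8660
cosines = [math.cos(a) for a in angles]  # about 0.8660, 0.7071, 0.5
tangents = [math.tan(a) for a in angles]
cotangents = [cot(a) for a in angles]    # reciprocal of the tangents
```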
### 4.50 Arcsine
```rustdfscript
Function
arcsin
Syntax
arcsin(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.51 Arccosine
```rustdfscript
Function
arccos
Syntax
arccos(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.52 Arctangent
```rustdfscript
Function
arctan
Syntax
arctan(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
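The inverse trigonometric functions in 4.50 to 4.52 map a ratio back to an angle, and `arcsin` and `arccos` are only defined for inputs in [-1, 1]. The sketch below shows the expected values in plain Python, assuming results in radians. How RustDFScript reports out-of-domain inputs is not stated in this document; returning NaN in the hypothetical helper `arcsin_or_nan` is purely a placeholder assumption.

```python
import math

def arcsin_or_nan(x):
    """arcsin for x in [-1, 1]; NaN otherwise (placeholder assumption)."""
    return math.asin(x) if -1.0 <= x <= 1.0 else math.nan

in_domain = arcsin_or_nan(0.5)      # about pi / 6
out_of_domain = arcsin_or_nan(2.0)  # NaN in this sketch
arccos_half = math.acos(0.5)        # about pi / 3
arctan_one = math.atan(1.0)         # about pi / 4
```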
### 4.53 Hyperbolic sine
```rustdfscript
Function
sinh
Syntax
sinh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.54 Hyperbolic cosine
```rustdfscript
Function
cosh
Syntax
cosh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.55 Hyperbolic tangent
```rustdfscript
Function
tanh
Syntax
tanh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.56 Inverse hyperbolic sine
```rustdfscript
Function
arcsinh
Syntax
arcsinh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.57 Inverse hyperbolic cosine
```rustdfscript
Function
arccosh
Syntax
arccosh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.58 Inverse hyperbolic tangent
```rustdfscript
Function
arctanh
Syntax
arctanh(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
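The hyperbolic functions in 4.53 to 4.58 can be checked against their exponential definitions. The sketch below computes them for a single value in plain Python and verifies that the inverse functions undo them. It only illustrates the mathematics and assumes the RustDFScript functions follow the standard definitions.

```python
import math

x = 0.5  # arbitrary sample value

# Standard definitions in terms of the exponential function.
sinh_x = (math.exp(x) - math.exp(-x)) / 2
cosh_x = (math.exp(x) + math.exp(-x)) / 2
tanh_x = sinh_x / cosh_x

# The inverse functions recover x (arccosh only for x >= 0).
back_from_sinh = math.asinh(sinh_x)
back_from_cosh = math.acosh(cosh_x)
back_from_tanh = math.atanh(tanh_x)
```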
### 4.59 Radians to degrees
```rustdfscript
Function
degrees
Syntax
degrees(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.60 Degrees to radians
```rustdfscript
Function
radians
Syntax
radians(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
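`degrees` and `radians` are inverse conversions: degrees = radians × 180 / π and radians = degrees × π / 180. A minimal plain-Python sketch on a hypothetical column of angles:

```python
import math

# Hypothetical column of angles in radians.
radians_col = [0.0, math.pi / 2, math.pi]

degrees_col = [math.degrees(r) for r in radians_col]  # about 0, 90, 180
round_trip = [math.radians(d) for d in degrees_col]   # back to radians
```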
### 4.61 Floor
```rustdfscript
Function
floor
Syntax
floor(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.62 Ceiling
```rustdfscript
Function
ceil
Syntax
ceil(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.63 Round
```rustdfscript
Function
round
Syntax
round(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
### 4.64 Sign
```rustdfscript
Function
sign
Syntax
sign(col(col_name))
Parameters
col - function that references a DataFrame column
col_name - column name
```
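The rounding functions in 4.61 to 4.64 differ mainly on negative values and on ties such as 2.5. The sketch below compares them in plain Python. This document does not state which tie-breaking rule `round` uses, so the sketch shows two common candidates side by side: round half away from zero (the hypothetical helper `round_half_away`) and Python's built-in round-half-to-even. Neither is claimed to be the RustDFScript behaviour.

```python
import math

def sign(x):
    """-1 for negative, 0 for zero, 1 for positive."""
    return (x > 0) - (x < 0)

def round_half_away(x):
    """Round to the nearest integer, ties away from zero (one possible rule)."""
    return math.copysign(math.floor(abs(x) + 0.5), x)

values = [-2.5, -0.4, 0.0, 0.4, 2.5]

floors = [math.floor(v) for v in values]         # [-3, -1, 0, 0, 2]
ceils = [math.ceil(v) for v in values]           # [-2, 0, 0, 1, 3]
signs = [sign(v) for v in values]                # [-1, -1, 0, 1, 1]
half_away = [round_half_away(v) for v in values] # ties go to -3 and 3
half_even = [round(v) for v in values]           # ties go to -2 and 2
```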
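### 4.65 Inner product
```rustdfscript
Function
dot
Syntax
dot(col(col_name1), col(col_name2))
Parameters
col - function that references a DataFrame column
col_name1 - name of the first column
col_name2 - name of the second column
Example
Input input_df: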
+---------+--------+
| t1 | t2 |
| --- | --- |
| f64 | f64 |
+=========+========+
| 1 | 2 |
+---------+--------+
| 2 | 3 |
+---------+--------+
| 3 | 4 |
+---------+--------+
| 4 | 5 |
+---------+--------+
| 5 | 6 |
+---------+--------+
Expression: select(alias(dot(col(t1), col(t2)), t1), input_df)
Result
+---------+
| t1 |
| --- |
| f64 |
+=========+
| 70 |
+---------+
```
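The inner product multiplies the two columns row by row and sums the products. The plain-Python sketch below recomputes the example above as an arithmetic check; it is not the RustDFScript implementation.

```python
# Columns from the dot example.
t1 = [1.0, 2.0, 3.0, 4.0, 5.0]
t2 = [2.0, 3.0, 4.0, 5.0, 6.0]

# 1*2 + 2*3 + 3*4 + 4*5 + 5*6
dot = sum(a * b for a, b in zip(t1, t2))

print(dot)  # 70.0
```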