Simplify Java Stream Operations with JDFrame: A Semantic DataFrame API
This article introduces JDFrame/SDFrame, a JVM-level DataFrame library that offers a more semantic, concise API for Java 8 stream processing. It covers a quick start, walks through the API by category (filtering, aggregation, distinct, grouping, sorting, joining, and cutting), and explains the difference between SDFrame and JDFrame with practical code examples.
The author, "Architect", notes that the tool is inspired by DataFrame models such as those in Spark, tablesaw, and joinery.
1. Quick Start
1.1 Add Dependency
<dependency>
<groupId>io.github.burukeyou</groupId>
<artifactId>jdframe</artifactId>
<version>0.0.4</version>
</dependency>
1.2 Example
Calculate the total score of each school for students whose age is not null and between 9 and 16, then retrieve the top two schools.
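For comparison, the same query can be written with plain Java 8 streams. The sketch below is not part of JDFrame: the class `PlainStreamQuickStart`, its trimmed-down `Student` type, and the ASCII school labels (S1, S2, S3, standing in for the three schools in the sample data) are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain Java 8 Stream version of the quick-start query, for comparison only.
class PlainStreamQuickStart {

    // Hypothetical, trimmed-down stand-in for the article's Student class.
    static class Student {
        private final String school;
        private final Integer age;
        private final BigDecimal score;

        Student(String school, Integer age, BigDecimal score) {
            this.school = school;
            this.age = age;
            this.score = score;
        }

        String getSchool() { return school; }
        Integer getAge() { return age; }
        BigDecimal getScore() { return score; }
    }

    // Same shape as the article's sample data; S1/S2/S3 are placeholder school labels.
    static List<Student> sampleStudents() {
        return Arrays.asList(
                new Student("S1", 11, new BigDecimal(1)),
                new Student("S1", 11, new BigDecimal(1)),
                new Student("S1", 12, new BigDecimal(2)),
                new Student("S2", 13, new BigDecimal(3)),
                new Student("S2", 14, new BigDecimal(4)),
                new Student("S3", 14, new BigDecimal(5)),
                new Student("S3", 15, new BigDecimal(5)));
    }

    // Top `limit` schools by total score, counting only students aged minAge..maxAge inclusive.
    static List<Map.Entry<String, BigDecimal>> topSchools(
            List<Student> students, int minAge, int maxAge, int limit) {
        // where age is not null and age between minAge and maxAge, group by school, sum(score)
        Map<String, BigDecimal> totals = students.stream()
                .filter(s -> s.getAge() != null)
                .filter(s -> s.getAge() >= minAge && s.getAge() <= maxAge)
                .collect(Collectors.groupingBy(
                        Student::getSchool,
                        Collectors.reducing(BigDecimal.ZERO, Student::getScore, BigDecimal::add)));

        // order by sum(score) desc limit `limit`
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, BigDecimal>comparingByValue().reversed())
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints "S3 10" then "S2 7"
        topSchools(sampleStudents(), 9, 16, 2)
                .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}
```

The grouping needs an intermediate `Map`, and the ranking needs a second stream over its entries. The JDFrame pipeline in this section expresses the same steps as a single chain.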
static List<Student> studentList = new ArrayList<>();
static {
studentList.add(new Student(1, "a", "一中", "一年级", 11, new BigDecimal(1)));
studentList.add(new Student(2, "a", "一中", "一年级", 11, new BigDecimal(1)));
studentList.add(new Student(3, "b", "一中", "三年级", 12, new BigDecimal(2)));
studentList.add(new Student(4, "c", "二中", "一年级", 13, new BigDecimal(3)));
studentList.add(new Student(5, "d", "二中", "一年级", 14, new BigDecimal(4)));
studentList.add(new Student(6, "e", "三中", "二年级", 14, new BigDecimal(5)));
studentList.add(new Student(7, "e", "三中", "二年级", 15, new BigDecimal(5)));
}
// Equivalent SQL:
// select school, sum(score) from students
// where age is not null and age >=9 and age <=16
// group by school order by sum(score) desc limit 2
SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
.whereNotNull(Student::getAge)
.whereBetween(Student::getAge, 9, 16)
.groupBySum(Student::getSchool, Student::getScore)
.sortDesc(FI2::getC2)
.cutFirst(2);
sdf2.show();
Output:
c1 c2
三中 10
二中 7
2. API Cases
2.1 Matrix View
void show(int n); // print matrix
List<String> columns(); // get column names
List<R> col(Function<T,R> f); // get a column
T head(); // first element
List<T> head(int n); // first n elements
T tail(); // last element
List<T> tail(int n); // last n elements
List<T> page(int page, int pageSize); // pagination
2.2 Filtering
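The three `whereBetween` variants in this section differ only in which endpoints they include. A rough plain-Java sketch of those interval semantics follows. The helper names `closed`, `leftOpen`, and `rightOpen` are hypothetical, and dropping null values is an assumption, not documented JDFrame behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Interval predicates matching the bracket notation used in the filter comments.
class IntervalFilterSketch {

    // [lo, hi]: both endpoints included (the notation given for whereBetween)
    static <T extends Comparable<T>> Predicate<T> closed(T lo, T hi) {
        return v -> v != null && v.compareTo(lo) >= 0 && v.compareTo(hi) <= 0;
    }

    // (lo, hi]: lower endpoint excluded (the notation given for whereBetweenR)
    static <T extends Comparable<T>> Predicate<T> leftOpen(T lo, T hi) {
        return v -> v != null && v.compareTo(lo) > 0 && v.compareTo(hi) <= 0;
    }

    // [lo, hi): upper endpoint excluded (the notation given for whereBetweenL)
    static <T extends Comparable<T>> Predicate<T> rightOpen(T lo, T hi) {
        return v -> v != null && v.compareTo(lo) >= 0 && v.compareTo(hi) < 0;
    }

    // Keeps the values that satisfy the predicate, preserving order.
    static <T> List<T> keep(List<T> values, Predicate<T> predicate) {
        return values.stream().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(3, 4, 5, 6);
        System.out.println(keep(ages, closed(3, 6)));    // [3, 4, 5, 6]
        System.out.println(keep(ages, leftOpen(3, 6)));  // [4, 5, 6]
        System.out.println(keep(ages, rightOpen(3, 6))); // [3, 4, 5]
    }
}
```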
SDFrame.read(studentList)
.whereBetween(Student::getAge, 3, 6) // [3,6]
.whereBetweenR(Student::getAge, 3, 6) // (3,6]
.whereBetweenL(Student::getAge, 3, 6) // [3,6)
.whereNotNull(Student::getName) // name not null
.whereGt(Student::getAge, 3) // >3
.whereGe(Student::getAge, 3) // >=3
.whereLt(Student::getAge, 3) // <3
.whereIn(Student::getAge, Arrays.asList(3,7,8))
.whereNotIn(Student::getAge, Arrays.asList(3,7,8))
.whereEq(Student::getAge, 3)
.whereNotEq(Student::getAge, 3)
.whereLike(Student::getName, "jay")
.whereLikeLeft(Student::getName, "jay")
.whereLikeRight(Student::getName, "jay");
2.3 Aggregation
JDFrame<Student> frame = JDFrame.read(studentList);
Student maxAgeStudent = frame.max(Student::getAge);
Integer maxAge = frame.maxValue(Student::getAge);
Student minAgeStudent = frame.min(Student::getAge);
Integer minAge = frame.minValue(Student::getAge);
BigDecimal avgAge = frame.avg(Student::getAge);
BigDecimal sumAge = frame.sum(Student::getAge);
MaxMin<Student> maxMinStudent = frame.maxMin(Student::getAge);
MaxMin<Integer> maxMinValue = frame.maxMinValue(Student::getAge);
2.4 Distinct
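Plain Java streams only offer `distinct()` based on `equals`, so de-duplicating by a chosen key takes a small helper. A minimal sketch of the idea, assuming the first row seen for each key is kept (the helper `distinctBy` is hypothetical and says nothing about how JDFrame implements its own `distinct`):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Distinct-by-key: keep the first row for every key value.
class DistinctByKeySketch {

    static <T, K> List<T> distinctBy(List<T> rows, Function<? super T, ? extends K> key) {
        Set<K> seen = new HashSet<>();
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            // Set.add returns false when the key was already present
            if (seen.add(key.apply(row))) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two rows of {school, level}; the parts differ but concatenate to the same string.
        List<String[]> rows = Arrays.asList(
                new String[] {"ab", "c"},
                new String[] {"a", "bc"});

        // Concatenated key: "abc" for both rows, so one row is dropped.
        System.out.println(distinctBy(rows, r -> r[0] + r[1]).size());              // 1

        // Composite key: the list of parts compares field by field, so both rows stay.
        System.out.println(distinctBy(rows, r -> Arrays.asList(r[0], r[1])).size()); // 2
    }
}
```

Building a key by string concatenation can merge rows whose parts differ but whose concatenations match, so a composite key such as a list of the parts is safer when field values can overlap.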
List<Student> distinctByObject = SDFrame.read(studentList).distinct().toLists();
List<Student> distinctBySchool = SDFrame.read(studentList).distinct(Student::getSchool).toLists();
List<Student> distinctBySchoolAndLevel = SDFrame.read(studentList).distinct(e -> e.getSchool() + e.getLevel()).toLists();
List<Student> multiDistinct = SDFrame.read(studentList).distinct(Student::getSchool).distinct(Student::getLevel).toLists();
2.5 Group & Aggregate (SQL-like)
List<FI2<String, BigDecimal>> sumBySchool = JDFrame.from(studentList)
.groupBySum(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Integer>> maxBySchool = JDFrame.from(studentList)
.groupByMaxValue(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Student>> maxObjBySchool = JDFrame.from(studentList)
.groupByMax(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Integer>> minBySchool = JDFrame.from(studentList)
.groupByMinValue(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Long>> countBySchool = JDFrame.from(studentList)
.groupByCount(Student::getSchool).toLists();
List<FI2<String, BigDecimal>> avgBySchool = JDFrame.from(studentList)
.groupByAvg(Student::getSchool, Student::getAge).toLists();
List<FI3<String, BigDecimal, Long>> sumCountBySchool = JDFrame.from(studentList)
.groupBySumCount(Student::getSchool, Student::getAge).toLists();
// two‑level grouping
List<FI3<String, String, BigDecimal>> sumBySchoolLevel = JDFrame.from(studentList)
.groupBySum(Student::getSchool, Student::getLevel, Student::getAge).toLists();
// three‑level grouping
List<FI4<String, String, String, BigDecimal>> sumBySchoolLevelName = JDFrame.from(studentList)
.groupBySum(Student::getSchool, Student::getLevel, Student::getName, Student::getAge).toLists();
2.6 Sorting
// order by age desc
SDFrame.read(studentList).sortDesc(Student::getAge);
// multi‑level: age desc, level asc
SDFrame.read(studentList).sortAsc(Sorter.sortDescBy(Student::getAge).sortAsc(Student::getLevel));
// order by age asc
SDFrame.read(studentList).sortAsc(Student::getAge);
// using Comparator
SDFrame.read(studentList).sortAsc(Comparator.comparing(e -> e.getLevel() + e.getId()));
2.7 Join (Matrix Connection)
append(T t); // like List.add
union(IFrame<T> other); // like List.addAll
join(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join); // inner join
leftJoin(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join); // left join
rightJoin(IFrame<K> other, JoinOn<T,K> on, Join<T,K,R> join); // right join
Example of inner join:
System.out.println("======== Matrix1 =======");
SDFrame<Student> sdf = SDFrame.read(studentList);
sdf.show(20);
SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
.whereNotNull(Student::getAge)
.whereBetween(Student::getAge, 9, 16)
.groupBySum(Student::getSchool, Student::getScore)
.sortDesc(FI2::getC2)
.cutFirst(10);
System.out.println("======== Matrix2 =======");
sdf2.show();
// sdf is an SDFrame, so the joined frame is declared as an SDFrame as well
SDFrame<UserInfo> frame = sdf.join(sdf2,
(a,b) -> a.getSchool().equals(b.getC1()),
(a,b) -> {
UserInfo ui = new UserInfo();
ui.setKey1(a.getSchool());
ui.setKey2(b.getC2().intValue());
ui.setKey3(String.valueOf(a.getId()));
return ui;
});
System.out.println("======== Joined Result =======");
frame.show(5);
2.8 Cutting
cutFirst(int n); // first n rows
cutLast(int n); // last n rows
cut(Integer start, Integer end); // sub‑list like List.subList
cutPage(int page, int pageSize); // pagination
cutFirstRank(Sorter<T> sorter, int n); // top n by rank
2.9 Frame Parameter Settings
defaultScale(int scale, RoundingMode roundingMode); // set default decimal precision
2.10 Other Utilities
Percentage conversion: SDFrame.read(list).mapPercent(getScore, setScore, 2)
Partition: split the list into sub-lists of a given size
Row number: add a row-number column based on the ordering
Replenish: fill in missing dimension entries (e.g., missing schools or grades) using a custom function
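To make the partition utility concrete, here is a minimal plain-Java sketch of splitting a list into sub-lists of a given size. The class `PartitionSketch` and its `partition` method are hypothetical names, and the choice to let the last chunk be shorter is an assumption; neither is taken from the JDFrame API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Splits a list into consecutive chunks of at most `size` elements.
class PartitionSketch {

    static <T> List<List<T>> partition(List<T> source, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int start = 0; start < source.size(); start += size) {
            int end = Math.min(start + size, source.size());
            // Copy the view so each chunk is independent of the source list
            parts.add(new ArrayList<>(source.subList(start, end)));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [[1, 2, 3], [4, 5, 6], [7]]
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 3));
    }
}
```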
3. Window Functions
JDFrame also supports programmable window functions; see the tutorial at https://juejin.cn/post/7367306429054959631 .
Final Notes
Code repository: https://github.com/burukeYou/JDFrame
Maven coordinates: https://central.sonatype.com/artifact/io.github.burukeyou/jdframe
The library provides two frames with identical APIs. SDFrame behaves like a lazy Java Stream: operations take effect only when a terminal action runs, and the data must be read again for each new stage of processing. JDFrame applies every operation immediately, so intermediate results can be reused without re-reading. Use SDFrame for simple one-pass stream processing and JDFrame when you need to pause, inspect, or branch the data flow.
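The lazy-versus-immediate distinction can be illustrated by analogy with the JDK itself. The sketch below uses only `java.util.stream` and a materialized `List`; it does not show JDFrame internals, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Analogy only: a Stream is single-use, a materialized List can be branched.
class LazyVsEagerSketch {

    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean streamIsSingleUse() {
        Stream<Integer> ages = Stream.of(11, 12, 13, 14).filter(a -> a > 11);
        ages.count(); // the first terminal operation consumes the stream
        try {
            ages.count(); // a second pass needs a fresh stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Applies the filter right away and returns a reusable result.
    static List<Integer> materialize(List<Integer> ages) {
        return ages.stream().filter(a -> a > 11).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(streamIsSingleUse()); // true

        // The intermediate result can be inspected and then used in two branches.
        List<Integer> kept = materialize(Arrays.asList(11, 12, 13, 14));
        int oldest = Collections.max(kept);
        long olderThan12 = kept.stream().filter(a -> a > 12).count();
        System.out.println(kept + " oldest=" + oldest + " olderThan12=" + olderThan12);
    }
}
```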
