
JDFrame: A JVM‑Level DataFrame‑Like API for Simplified Java Stream Processing

This article introduces JDFrame/SDFrame, a Java library that provides a DataFrame‑style, semantic API for stream processing, covering quick start, dependency setup, extensive examples of filtering, aggregation, distinct, grouping, sorting, joining, and utility functions, along with Maven coordinates and source repository links.

Java Architect Essentials

The author presents JDFrame (and its counterpart SDFrame), a JVM‑level DataFrame‑style tool designed to make Java 8 stream operations more expressive and concise, especially for tasks that would otherwise require verbose stream code.

0. Introduction

Motivated by the difficulty of remembering long Stream APIs and the desire for a more semantic, DataFrame‑like approach (similar to Spark or Pandas), the author created a library that abstracts common stream operations into readable methods.

1. Quick Start

1.1 Add Dependency

<dependency>
    <groupId>io.github.burukeyou</groupId>
    <artifactId>jdframe</artifactId>
    <version>0.0.2</version>
</dependency>

1.2 Example

Calculate the total score of students aged 9‑16 for each school and retrieve the top‑2 schools.

static List<Student> studentList = new ArrayList<>();
// ... populate list ...
SDFrame<FI2<String, BigDecimal>> sdf2 = SDFrame.read(studentList)
    .whereNotNull(Student::getAge)
    .whereBetween(Student::getAge, 9, 16)
    .groupBySum(Student::getSchool, Student::getScore)
    .sortDesc(FI2::getC2)
    .cutFirst(2);

sdf2.show();

Output:

c1  c2
三中 10
二中 7
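For comparison, here is a rough sketch of the same query written with plain `java.util.stream`. The `Student` class and the sample data are hypothetical stand-ins, since the article does not show them, and the result is returned as map entries rather than JDFrame's `FI2` type.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class QuickStartWithStreams {

    // Hypothetical stand-in for the article's Student class (only the fields used here).
    static final class Student {
        private final String school;
        private final Integer age;
        private final BigDecimal score;

        Student(String school, Integer age, String score) {
            this.school = school;
            this.age = age;
            this.score = new BigDecimal(score);
        }

        String getSchool() { return school; }
        Integer getAge() { return age; }
        BigDecimal getScore() { return score; }
    }

    // Made-up sample data, not the article's data set.
    static List<Student> sample() {
        return Arrays.asList(
                new Student("North", 10, "6"),
                new Student("North", 12, "4"),
                new Student("South", 15, "7"),
                new Student("East", 9, "3"),
                new Student("South", null, "100"),  // dropped: age is null
                new Student("East", 20, "50"));     // dropped: age outside 9..16
    }

    // Total score per school for students aged 9 to 16 (inclusive), top n by total.
    static List<Map.Entry<String, BigDecimal>> topSchools(List<Student> students, int n) {
        Map<String, BigDecimal> totals = students.stream()
                .filter(s -> s.getAge() != null)
                .filter(s -> s.getAge() >= 9 && s.getAge() <= 16)
                .collect(Collectors.groupingBy(
                        Student::getSchool,
                        Collectors.reducing(BigDecimal.ZERO, Student::getScore, BigDecimal::add)));

        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, BigDecimal>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        topSchools(sample(), 2)
                .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}
```

The grouping collector and the second stream over the entry set are the parts that the `groupBySum(...).sortDesc(...).cutFirst(2)` chain condenses.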

2. API Cases

2.1 Matrix Information

void show(int n);                          // print matrix info
List<String> columns();                    // get header names
<R> List<R> col(Function<T, R> function);  // get a column
T head();                                  // first element
List<T> head(int n);                       // first n elements
T tail();                                  // last element
List<T> tail(int n);                       // last n elements

2.2 Filtering

SDFrame.read(studentList)
    .whereBetween(Student::getAge, 3, 6)          // [3,6]
    .whereBetweenR(Student::getAge, 3, 6)         // (3,6]
    .whereBetweenL(Student::getAge, 3, 6)         // [3,6)
    .whereNotNull(Student::getName)
    .whereGt(Student::getAge, 3)
    .whereGe(Student::getAge, 3)
    .whereLt(Student::getAge, 3)
    .whereIn(Student::getAge, Arrays.asList(3,7,8))
    .whereNotIn(Student::getAge, Arrays.asList(3,7,8))
    .whereEq(Student::getAge, 3)
    .whereNotEq(Student::getAge, 3)
    .whereLike(Student::getName, "jay")
    .whereLikeLeft(Student::getName, "jay")
    .whereLikeRight(Student::getName, "jay");
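The chain above reads as a catalogue of the available filters rather than one meaningful query (for example, `whereEq` and `whereNotEq` on the same value cannot both match). The interval comments on the three `whereBetween` variants can be spelled out as plain predicates. This is only a sketch of the bounds described in those comments, not the library's implementation; rejecting `null` values is an assumption made here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class IntervalFilters {

    // [lo, hi]: both endpoints included, matching the comment on whereBetween.
    static Predicate<Integer> between(int lo, int hi) {
        return v -> v != null && v >= lo && v <= hi;
    }

    // (lo, hi]: lower endpoint excluded, matching the comment on whereBetweenR.
    static Predicate<Integer> betweenR(int lo, int hi) {
        return v -> v != null && v > lo && v <= hi;
    }

    // [lo, hi): upper endpoint excluded, matching the comment on whereBetweenL.
    static Predicate<Integer> betweenL(int lo, int hi) {
        return v -> v != null && v >= lo && v < hi;
    }

    static List<Integer> keep(List<Integer> values, Predicate<Integer> predicate) {
        return values.stream().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(2, 3, 4, 5, 6, 7);
        System.out.println(keep(ages, between(3, 6)));
        System.out.println(keep(ages, betweenR(3, 6)));
        System.out.println(keep(ages, betweenL(3, 6)));
    }
}
```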

2.3 Aggregation

JDFrame<Student> frame = JDFrame.read(studentList);
Student maxAgeStudent = frame.max(Student::getAge);
Integer maxAge = frame.maxValue(Student::getAge);
Student minAgeStudent = frame.min(Student::getAge);
Integer minAge = frame.minValue(Student::getAge);
BigDecimal avgAge = frame.avg(Student::getAge);
BigDecimal sumAge = frame.sum(Student::getAge);
MaxMin<Student> maxMinStudent = frame.maxMin(Student::getAge);
MaxMin<Integer> maxMinValue = frame.maxMinValue(Student::getAge);
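The listing distinguishes between methods that return the row holding the extreme value (`max`, `min`) and methods that return the value itself (`maxValue`, `minValue`). A minimal sketch of that distinction with plain streams follows. The helper names mirror the article's, but the bodies are illustrative; skipping `null` fields, returning `Optional`, and the explicit `scale` parameter with `HALF_UP` rounding are assumptions, not documented JDFrame behaviour.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Function;

class FieldAggregates {

    // The row whose field is largest (rows with a null field are ignored).
    static <T> Optional<T> max(List<T> rows, Function<T, Integer> field) {
        return rows.stream()
                .filter(r -> field.apply(r) != null)
                .max(Comparator.comparing(field));
    }

    // The largest field value itself.
    static <T> Optional<Integer> maxValue(List<T> rows, Function<T, Integer> field) {
        return max(rows, field).map(field);
    }

    // Sum of the non-null field values as a BigDecimal.
    static <T> BigDecimal sum(List<T> rows, Function<T, Integer> field) {
        return rows.stream()
                .map(field)
                .filter(Objects::nonNull)
                .map(v -> BigDecimal.valueOf(v.longValue()))
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Average of the non-null field values, rounded half-up to the given scale.
    static <T> Optional<BigDecimal> avg(List<T> rows, Function<T, Integer> field, int scale) {
        long n = rows.stream().map(field).filter(Objects::nonNull).count();
        if (n == 0) {
            return Optional.empty();
        }
        return Optional.of(sum(rows, field).divide(BigDecimal.valueOf(n), scale, RoundingMode.HALF_UP));
    }
}
```

With a `Student` class like the article's, a call would look like `FieldAggregates.max(studentList, Student::getAge)`.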

2.4 Distinct

The native Stream.distinct() only deduplicates whole objects (via equals and hashCode); JDFrame adds deduplication by a single field or by a computed key.

List<Student> distinct = SDFrame.read(studentList).distinct().toLists();
List<Student> distinctBySchool = SDFrame.read(studentList).distinct(Student::getSchool).toLists();
List<Student> distinctByComposite = SDFrame.read(studentList).distinct(e -> e.getSchool() + e.getLevel()).toLists();
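Field-level deduplication can be sketched in plain Java with a set of already-seen keys. Keeping the first row for each key is an assumption made here; the article does not say which duplicate JDFrame retains. The sketch also shows one caveat of string-concatenated composite keys: `"A1" + "2"` and `"A" + "12"` produce the same key, whereas a list of the parts keeps them apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class DistinctByKey {

    // Keeps the first row seen for each key, preserving input order.
    static <T, K> List<T> distinctBy(List<T> rows, Function<? super T, ? extends K> key) {
        Set<K> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T row : rows) {
            if (seen.add(key.apply(row))) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Each row is {school, level}; hypothetical data.
        List<String[]> rows = Arrays.asList(
                new String[] {"A1", "2"},
                new String[] {"A", "12"});

        // Concatenated key: both rows collapse to "A12", so one is lost.
        System.out.println(distinctBy(rows, r -> r[0] + r[1]).size());

        // List key: the two rows stay distinct.
        System.out.println(distinctBy(rows, r -> Arrays.asList(r[0], r[1])).size());
    }
}
```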

2.5 Simple Group‑by Aggregation

JDFrame<Student> frame = JDFrame.from(studentList);
List<FI2<String, BigDecimal>> sumBySchool = frame.groupBySum(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Integer>> maxBySchool = frame.groupByMaxValue(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Student>> maxObjBySchool = frame.groupByMax(Student::getSchool, Student::getAge).toLists();
List<FI2<String, Long>> countBySchool = frame.groupByCount(Student::getSchool).toLists();
List<FI2<String, BigDecimal>> avgBySchool = frame.groupByAvg(Student::getSchool, Student::getAge).toLists();
// multi‑level grouping examples omitted for brevity
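For readers who want to see what these one-liners replace, the following sketch expresses count, sum, and max per group with `Collectors.groupingBy`. The helper names are illustrative, and the results are plain maps with `Integer` sums rather than JDFrame's `FI2` rows and `BigDecimal` sums.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.ToIntFunction;
import java.util.stream.Collectors;

class GroupByWithCollectors {

    // Number of rows per key; LinkedHashMap keeps keys in first-seen order.
    static <T, K> Map<K, Long> countBy(List<T> rows, Function<T, K> key) {
        return rows.stream()
                .collect(Collectors.groupingBy(key, LinkedHashMap::new, Collectors.counting()));
    }

    // Sum of an int-valued field per key.
    static <T, K> Map<K, Integer> sumBy(List<T> rows, Function<T, K> key, ToIntFunction<T> value) {
        return rows.stream()
                .collect(Collectors.groupingBy(key, LinkedHashMap::new, Collectors.summingInt(value)));
    }

    // The row with the largest field value per key.
    static <T, K> Map<K, Optional<T>> maxBy(List<T> rows, Function<T, K> key, ToIntFunction<T> value) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        key,
                        LinkedHashMap::new,
                        Collectors.maxBy(Comparator.comparingInt(value))));
    }
}
```

Each helper corresponds loosely to one line in the listing above: `countBy` to `groupByCount`, `sumBy` to `groupBySum`, and `maxBy` to `groupByMax`.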

2.6 Sorting

SDFrame.read(studentList).sortDesc(Student::getAge);
SDFrame.read(studentList).sortDesc(Student::getAge).sortAsc(Student::getLevel);
SDFrame.read(studentList).sortAsc(Student::getAge);
SDFrame.read(studentList).sortAsc(Comparator.comparing(e -> e.getLevel() + e.getId()));
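The second line chains `sortDesc` and `sortAsc`. Assuming the chain is meant as a compound sort (first key descending, then second key ascending for ties), which the article does not state explicitly, the equivalent with a plain `Comparator` might look like the sketch below. The helper name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

class MultiKeySort {

    // Sorts a copy of the rows by `first` descending, breaking ties by `second` ascending.
    static <T> List<T> sortDescThenAsc(List<T> rows,
                                       Function<T, Integer> first,
                                       Function<T, Integer> second) {
        List<T> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparing(first).reversed().thenComparing(second));
        return copy;
    }

    public static void main(String[] args) {
        // Each row is {age, level}; hypothetical data.
        List<int[]> rows = Arrays.asList(
                new int[] {10, 2}, new int[] {12, 1}, new int[] {10, 1}, new int[] {12, 3});
        for (int[] row : sortDescThenAsc(rows, r -> r[0], r -> r[1])) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```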

2.7 Joining Matrices

The joining API includes append, union, join, leftJoin, and rightJoin. Example of an inner join:

SDFrame<Student> sdf = SDFrame.read(studentList);
SDFrame<FI2<String, BigDecimal>> topSchools = /* same as earlier */;
SDFrame<UserInfo> frame = sdf.join(topSchools,
    (a,b) -> a.getSchool().equals(b.getC1()),
    (a,b) -> {
        UserInfo ui = new UserInfo();
        ui.setKey1(a.getSchool());
        ui.setKey2(b.getC2().intValue());
        ui.setKey3(String.valueOf(a.getId()));
        return ui;
    });
frame.show(5);
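The `join` call takes a match predicate and a function that combines each matching pair. A minimal nested-loop sketch of that contract in plain Java is shown below; how JDFrame implements the join internally is not described in the article, so this only illustrates the semantics of an inner join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

class InnerJoinSketch {

    // Emits combine(a, b) for every pair (a, b) that satisfies `on`; unmatched rows are dropped.
    static <A, B, R> List<R> join(List<A> left,
                                  List<B> right,
                                  BiPredicate<A, B> on,
                                  BiFunction<A, B, R> combine) {
        List<R> out = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                if (on.test(a, b)) {
                    out.add(combine.apply(a, b));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data: "school#studentId" rows joined to "school=total" rows.
        List<String> students = Arrays.asList("North#1", "South#2", "East#3", "North#4");
        List<String> totals = Arrays.asList("North=10", "South=7");

        List<String> joined = join(students, totals,
                (s, t) -> s.split("#")[0].equals(t.split("=")[0]),
                (s, t) -> s + "|" + t.split("=")[1]);
        System.out.println(joined);
    }
}
```

The nested loop costs O(n × m) comparisons; for equality joins on large inputs, indexing one side in a `Map` first is the usual alternative.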

2.8 Other Utilities

Percentage conversion: SDFrame.read(list).mapPercent(Student::getScore, Student::setScore, 2)

Partition : split list into sub‑lists of a given size.

Generate sequence numbers and ranking numbers based on sorted order.

Replenish missing entries for dimensions such as schools or grades.
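The article names these utilities without showing their signatures, so the sketch below does not use JDFrame at all. It illustrates two of the ideas in plain Java: partitioning a list into fixed-size chunks, and assigning ranks over values that are already sorted in descending order. The tie rule used here (equal values share a rank and the next rank is skipped) is an assumption; the library may number ties differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListUtilities {

    // Splits rows into consecutive sub-lists of at most `size` elements.
    static <T> List<List<T>> partition(List<T> rows, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += size) {
            parts.add(new ArrayList<>(rows.subList(i, Math.min(i + size, rows.size()))));
        }
        return parts;
    }

    // Ranks for values sorted descending: ties share a rank, the following rank is skipped.
    static List<Integer> ranks(List<Integer> sortedDesc) {
        List<Integer> ranks = new ArrayList<>();
        for (int i = 0; i < sortedDesc.size(); i++) {
            if (i > 0 && sortedDesc.get(i).equals(sortedDesc.get(i - 1))) {
                ranks.add(ranks.get(i - 1));
            } else {
                ranks.add(i + 1);
            }
        }
        return ranks;
    }

    public static void main(String[] args) {
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5), 2));
        System.out.println(ranks(Arrays.asList(90, 90, 80, 70)));
    }
}
```

A plain sequence number, by contrast, is just the position `i + 1` in the sorted order regardless of ties.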

Final Notes

The library provides two frames: SDFrame (lazy, similar to native streams) and JDFrame (eager, operations take effect immediately). Choose SDFrame for simple one‑pass stream processing; use JDFrame when intermediate results are needed.
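To make the "lazy, similar to native streams" comparison concrete, the sketch below demonstrates the two relevant properties of `java.util.stream.Stream` itself: intermediate operations do not run until a terminal operation is invoked, and a stream cannot be consumed twice. It says nothing about SDFrame's internals; whether SDFrame shares the single-use restriction is not stated in the article.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyVersusEager {

    // Returns {predicate calls before the terminal op, calls after it, elements kept}.
    static int[] predicateCalls() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .filter(x -> {
                    calls.incrementAndGet();
                    return x > 1;
                });
        int before = calls.get();  // the filter has not run yet
        List<Integer> kept = pipeline.collect(Collectors.toList());
        int after = calls.get();   // the filter ran once per element
        return new int[] {before, after, kept.size()};
    }

    // Returns true if reusing a consumed stream throws IllegalStateException.
    static boolean secondTerminalOperationFails() {
        Stream<Integer> pipeline = Stream.of(1, 2, 3).filter(x -> x > 1);
        pipeline.collect(Collectors.toList());
        try {
            pipeline.collect(Collectors.toList());
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

An eager design trades this for materialised intermediate results that can be inspected or reused, which is the case the article recommends JDFrame for.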

Source code: https://github.com/burukeYou/JDFrame

Maven coordinates: https://central.sonatype.com/artifact/io.github.burukeyou/jdframe
