Java Streams¶

What are Streams?¶

Streams provide a declarative way to process collections of data. They support functional-style operations and can be parallelized.

// Imperative (traditional)
List<String> filtered = new ArrayList<>();
for (String name : names) {
    if (name.startsWith("A")) {
        filtered.add(name.toUpperCase());
    }
}

// Declarative (streams)
List<String> filtered = names.stream()
    .filter(name -> name.startsWith("A"))
    .map(String::toUpperCase)
    .collect(Collectors.toList());

Stream Pipeline¶

[Figure: Stream Pipeline]

Key Characteristics:

- Streams don't store data (unlike collections)
- Streams don't modify the source
- Operations are lazy (evaluated only when terminal operation is invoked)
- Streams can only be consumed once
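The laziness and single-use characteristics can be observed directly. The following is a minimal sketch, not a canonical test: the class name `StreamCharacteristicsDemo`, its method names, and the sample data are made up for illustration. The filter writes to a log only so that the moment of evaluation becomes visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; names and data are illustrative only.
class StreamCharacteristicsDemo {

    // Records when the filter actually runs relative to building the pipeline.
    static List<String> laziness() {
        List<String> log = new ArrayList<>();
        List<String> source = List.of("Anna", "Bob", "Alex");

        Stream<String> pipeline = source.stream()
            .filter(name -> {
                log.add("filter:" + name);  // side effect only to make evaluation visible
                return name.startsWith("A");
            });

        log.add("pipeline built");          // no element has been filtered yet

        List<String> result = pipeline.collect(Collectors.toList());  // terminal operation
        log.add("result:" + result);
        return log;
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> stream = Stream.of("a", "b", "c");
        stream.count();                     // first terminal operation consumes the stream
        try {
            stream.count();                 // second use of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        laziness().forEach(System.out::println);
        System.out.println("second use fails: " + secondUseFails());
    }
}
```

The log starts with "pipeline built" and only then lists the filter calls, because the filter runs when `collect()` is invoked, not when `filter()` is called.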

Creating Streams¶

// From Collection
List<String> list = List.of("a", "b", "c");
Stream<String> stream1 = list.stream();
Stream<String> parallel1 = list.parallelStream();

// From Array
String[] array = {"a", "b", "c"};
Stream<String> stream2 = Arrays.stream(array);
Stream<String> stream3 = Stream.of(array);  // the varargs overload also accepts an array

// From values
Stream<String> stream4 = Stream.of("x", "y", "z");

// Empty stream
Stream<String> empty = Stream.empty();

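// Infinite streams (bound them with limit() or takeWhile() before a terminal operation)
Stream<Integer> infinite1 = Stream.iterate(0, n -> n + 1);  // 0, 1, 2, 3, ...
Stream<Double> random = Stream.generate(Math::random);

// Bounded iterate with a hasNext predicate (Java 9+); this stream is finite
Stream<Integer> bounded = Stream.iterate(0, n -> n < 100, n -> n + 1);  // 0, 1, ..., 99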

// From range
IntStream range1 = IntStream.range(1, 10);       // 1-9 (exclusive end)
IntStream range2 = IntStream.rangeClosed(1, 10); // 1-10 (inclusive end)

// From String
IntStream chars = "hello".chars();  // IntStream of char values (UTF-16 code units) as ints

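// From file (the stream keeps the file open, so close it; Files.lines throws IOException)
try (Stream<String> lines = Files.lines(Path.of("file.txt"))) {  // Path.of: Java 11+
    lines.forEach(System.out::println);
}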

// Builder
Stream<String> built = Stream.<String>builder()
    .add("a")
    .add("b")
    .add("c")
    .build();

Intermediate Operations¶

filter() - Keep elements matching predicate¶

List<Integer> evens = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(Collectors.toList());

// Multiple conditions
List<Person> adults = people.stream()
    .filter(p -> p.getAge() >= 18)
    .filter(p -> p.getCountry().equals("US"))
    .collect(Collectors.toList());

map() - Transform elements¶

List<String> names = people.stream()
    .map(Person::getName)
    .collect(Collectors.toList());

List<Integer> lengths = words.stream()
    .map(String::length)
    .collect(Collectors.toList());

// Chained transformations
List<String> upperNames = people.stream()
    .map(Person::getName)
    .map(String::toUpperCase)
    .collect(Collectors.toList());

flatMap() - Flatten nested structures¶

// List of lists to single list
List<List<Integer>> nested = List.of(
    List.of(1, 2),
    List.of(3, 4),
    List.of(5, 6)
);
List<Integer> flat = nested.stream()
    .flatMap(List::stream)
    .collect(Collectors.toList());  // [1, 2, 3, 4, 5, 6]

// Flatten object's collections
List<String> allPhones = people.stream()
    .flatMap(p -> p.getPhoneNumbers().stream())
    .collect(Collectors.toList());

// Split and flatten
List<String> words = lines.stream()
    .flatMap(line -> Arrays.stream(line.split(" ")))
    .collect(Collectors.toList());

sorted() - Sort elements¶

// Natural order
List<String> sorted = names.stream()
    .sorted()
    .collect(Collectors.toList());

// Custom comparator
List<Person> byAge = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge))
    .collect(Collectors.toList());

// Multiple criteria
List<Person> sorted = people.stream()
    .sorted(Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getFirstName)
        .thenComparingInt(Person::getAge))
    .collect(Collectors.toList());

// Reverse order
List<Person> byAgeDesc = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge).reversed())
    .collect(Collectors.toList());

distinct() - Remove duplicates¶

List<Integer> unique = numbers.stream()
    .distinct()  // Uses equals(); elements need a consistent hashCode()
    .collect(Collectors.toList());

limit() and skip() - Pagination¶

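// First 5 elements
List<String> first5 = names.stream().limit(5).collect(Collectors.toList());

// Skip first 10, take next 5 (page 3 with a page size of 5)
// A fresh stream is needed here: a stream cannot be reused after a terminal operation
List<String> page3 = names.stream().skip(10).limit(5).collect(Collectors.toList());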

// Infinite stream with limit
List<Integer> first10 = Stream.iterate(1, n -> n + 1)
    .limit(10)
    .collect(Collectors.toList());  // [1, 2, 3, ..., 10]
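The skip/limit pairing above generalizes to a small pagination helper. This is a sketch under assumptions: the helper name `page`, its 1-based page numbering, and the class name `PaginationDemo` are hypothetical and not part of the Stream API.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class for illustration.
class PaginationDemo {

    // Returns the requested page; pageNumber is 1-based and must be >= 1.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        return items.stream()
            .skip((long) (pageNumber - 1) * pageSize)  // skip all earlier pages
            .limit(pageSize)                           // keep at most one page
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.rangeClosed(1, 12)
            .boxed()
            .collect(Collectors.toList());
        System.out.println(page(items, 1, 5));  // [1, 2, 3, 4, 5]
        System.out.println(page(items, 3, 5));  // [11, 12]
    }
}
```

A page past the end of the data comes back as an empty list, since `skip()` simply discards everything that is there.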

peek() - Debug/side effects¶

List<String> result = names.stream()
    .filter(n -> n.length() > 3)
    .peek(n -> System.out.println("Filtered: " + n))
    .map(String::toUpperCase)
    .peek(n -> System.out.println("Mapped: " + n))
    .collect(Collectors.toList());

takeWhile() / dropWhile() (Java 9+)¶

// Take while condition is true
List<Integer> result1 = Stream.of(1, 2, 3, 4, 5, 1, 2)
    .takeWhile(n -> n < 4)
    .collect(Collectors.toList());  // [1, 2, 3]

// Drop while condition is true
List<Integer> result2 = Stream.of(1, 2, 3, 4, 5, 1, 2)
    .dropWhile(n -> n < 4)
    .collect(Collectors.toList());  // [4, 5, 1, 2]

Terminal Operations¶

collect() - Gather results¶

// To List (each statement needs its own stream; a consumed stream cannot be reused)
List<String> list = names.stream().collect(Collectors.toList());
List<String> list2 = names.stream().toList();  // Java 16+ (unmodifiable)

// To Set
Set<String> set = names.stream().collect(Collectors.toSet());

// To specific collection
TreeSet<String> treeSet = names.stream().collect(Collectors.toCollection(TreeSet::new));

// To Map (throws IllegalStateException if two elements produce the same key)
Map<Integer, String> map = people.stream()
    .collect(Collectors.toMap(Person::getId, Person::getName));

// To Map with merge function (handle duplicates)
Map<String, Integer> wordCount = words.stream()
    .collect(Collectors.toMap(
        w -> w,
        w -> 1,
        Integer::sum  // Merge function for duplicates
    ));

// To String
String joined = names.stream()
    .collect(Collectors.joining(", "));  // "a, b, c"

String csv = names.stream()
    .collect(Collectors.joining(",", "[", "]"));  // "[a,b,c]"
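The difference between the two `toMap` snippets above shows up as soon as the input contains a repeated key. The sketch below uses made-up names (`ToMapDuplicatesDemo`, `failsOnDuplicateKey`, `countWords`) and sample words chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; names and data are illustrative only.
class ToMapDuplicatesDemo {

    // Two-argument toMap: a repeated key raises IllegalStateException.
    static boolean failsOnDuplicateKey(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Three-argument toMap: the merge function combines values that share a key.
    static Map<String, Integer> countWords(List<String> words) {
        return words.stream()
            .collect(Collectors.toMap(Function.identity(), w -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        List<String> words = List.of("to", "be", "or", "not", "to", "be");
        System.out.println("two-argument toMap fails: " + failsOnDuplicateKey(words));
        System.out.println("word counts: " + countWords(words));
    }
}
```

The iteration order of the resulting map is unspecified, so the printed order of the counts may vary.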

Grouping and Partitioning¶

// Group by single key
Map<String, List<Person>> byCity = people.stream()
    .collect(Collectors.groupingBy(Person::getCity));

// Group by with downstream collector
Map<String, Long> countByCity = people.stream()
    .collect(Collectors.groupingBy(Person::getCity, Collectors.counting()));

Map<String, Double> avgAgeByCity = people.stream()
    .collect(Collectors.groupingBy(
        Person::getCity,
        Collectors.averagingInt(Person::getAge)
    ));

// Nested grouping
Map<String, Map<String, List<Person>>> byCityAndDept = people.stream()
    .collect(Collectors.groupingBy(
        Person::getCity,
        Collectors.groupingBy(Person::getDepartment)
    ));

// Partitioning (split into two groups)
Map<Boolean, List<Person>> adultsAndMinors = people.stream()
    .collect(Collectors.partitioningBy(p -> p.getAge() >= 18));

List<Person> adults = adultsAndMinors.get(true);
List<Person> minors = adultsAndMinors.get(false);
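To make the grouping and partitioning snippets above concrete, here is a self-contained sketch on a tiny data set. The `Person` record is a hypothetical stand-in for the `Person` class the snippets assume (records need Java 16+), and the names, cities, and ages are invented.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the Person record and its data are illustrative only.
class GroupingDemo {

    record Person(String name, String city, int age) {}

    static final List<Person> PEOPLE = List.of(
        new Person("Ann", "Oslo", 34),
        new Person("Ben", "Oslo", 16),
        new Person("Cid", "Rome", 52)
    );

    // groupingBy with a downstream collector: one count per city.
    static Map<String, Long> countByCity() {
        return PEOPLE.stream()
            .collect(Collectors.groupingBy(Person::city, Collectors.counting()));
    }

    // partitioningBy with a downstream collector: names of adults and of minors.
    static Map<Boolean, List<String>> namesByAdulthood() {
        return PEOPLE.stream()
            .collect(Collectors.partitioningBy(
                p -> p.age() >= 18,
                Collectors.mapping(Person::name, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByCity());
        System.out.println(namesByAdulthood().get(true));   // [Ann, Cid]
        System.out.println(namesByAdulthood().get(false));  // [Ben]
    }
}
```

Unlike `groupingBy`, `partitioningBy` always produces both the `true` and the `false` key, even when one side is empty.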

reduce() - Aggregate to single value¶

// Sum
int sum = numbers.stream().reduce(0, Integer::sum);
Optional<Integer> sum2 = numbers.stream().reduce(Integer::sum);

// Product
int product = numbers.stream().reduce(1, (a, b) -> a * b);

// Max/Min
Optional<Integer> max = numbers.stream().reduce(Integer::max);
Optional<Integer> min = numbers.stream().reduce(Integer::min);

// Concatenate strings (copies the result on every step; Collectors.joining() scales better)
String concat = strings.stream().reduce("", String::concat);

// Custom reduction
Optional<Person> oldest = people.stream()
    .reduce((p1, p2) -> p1.getAge() > p2.getAge() ? p1 : p2);
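The two `reduce()` forms above differ mainly in how they treat an empty stream. A minimal sketch, with the class name `ReduceDemo` and its method names chosen for illustration:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical demo class for illustration.
class ReduceDemo {

    // With an identity: always returns a value; an empty stream yields the identity.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Without an identity: the result is an empty Optional when there are no elements.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3, 4)));      // 10
        System.out.println(sum(List.of()));                // 0
        System.out.println(max(List.of(3, 9, 4)));         // Optional[9]
        System.out.println(max(List.of()).isPresent());    // false
    }
}
```

The identity must be a neutral value for the operation (0 for addition, 1 for multiplication); otherwise it would distort the result.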

forEach() - Side effects¶

names.stream().forEach(System.out::println);
names.forEach(System.out::println);  // Iterable.forEach: same effect without creating a stream

// forEachOrdered (maintains order in parallel)
names.parallelStream().forEachOrdered(System.out::println);

count(), min(), max(), sum(), average()¶

long count = stream.count();

Optional<Integer> min = numbers.stream().min(Comparator.naturalOrder());
Optional<Integer> max = numbers.stream().max(Comparator.naturalOrder());

// For primitive streams
int sum = IntStream.of(1, 2, 3, 4, 5).sum();
double avg = IntStream.of(1, 2, 3, 4, 5).average().orElse(0);
IntSummaryStatistics stats = IntStream.of(1, 2, 3, 4, 5).summaryStatistics();
// stats.getSum(), stats.getAverage(), stats.getMin(), stats.getMax(), stats.getCount()

findFirst(), findAny()¶

Optional<String> first = names.stream()
    .filter(n -> n.startsWith("A"))
    .findFirst();

// findAny() may be faster on parallel streams: it can return any match and ignores encounter order
Optional<String> any = names.parallelStream()
    .filter(n -> n.startsWith("A"))
    .findAny();

anyMatch(), allMatch(), noneMatch()¶

boolean hasAdult = people.stream().anyMatch(p -> p.getAge() >= 18);
boolean allAdults = people.stream().allMatch(p -> p.getAge() >= 18);
boolean noMinors = people.stream().noneMatch(p -> p.getAge() < 18);

toArray()¶

String[] array = names.stream().toArray(String[]::new);
Object[] objects = names.stream().toArray();

Primitive Streams¶

// IntStream, LongStream, DoubleStream - avoid boxing overhead

// Create
IntStream intStream = IntStream.of(1, 2, 3);
IntStream range = IntStream.range(1, 100);
IntStream fromArray = Arrays.stream(new int[]{1, 2, 3});

// Convert from object stream
IntStream ages = people.stream().mapToInt(Person::getAge);
LongStream ids = people.stream().mapToLong(Person::getId);
DoubleStream salaries = people.stream().mapToDouble(Person::getSalary);

// Convert to object stream
Stream<Integer> boxed = intStream.boxed();

// Specialized operations
int sum = IntStream.of(1, 2, 3, 4, 5).sum();
OptionalDouble avg = IntStream.of(1, 2, 3, 4, 5).average();
OptionalInt max = IntStream.of(1, 2, 3, 4, 5).max();
IntSummaryStatistics stats = IntStream.of(1, 2, 3, 4, 5).summaryStatistics();

Parallel Streams¶

// Create parallel stream
Stream<String> parallel = list.parallelStream();
Stream<String> parallel2 = stream.parallel();

// Convert back to sequential
Stream<String> sequential = parallel.sequential();

// Check if parallel
boolean isParallel = stream.isParallel();

// Example
long count = list.parallelStream()
    .filter(s -> s.length() > 3)
    .count();

When to Use Parallel Streams¶

[Figure: Parallel Streams Guidelines]
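As a rough illustration of work that suits a parallel stream, the sketch below uses stateless lambdas and lets the terminal operation assemble the result instead of mutating shared state. The class name `ParallelDemo` and its methods are hypothetical, and whether parallel execution actually pays off depends on data size and per-element cost, which this example does not measure.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Hypothetical demo class for illustration.
class ParallelDemo {

    // Stateless mapping plus an associative reduction: the result is the same in either mode.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.map(x -> x * x).sum();
    }

    // collect() builds the list safely; adding to a shared ArrayList from a lambda would not be safe.
    static List<Integer> evenNumbers(int n) {
        return IntStream.rangeClosed(1, n)
            .parallel()
            .filter(x -> x % 2 == 0)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false));  // 333833500
        System.out.println(sumOfSquares(1000, true));   // 333833500
        System.out.println(evenNumbers(10));            // [2, 4, 6, 8, 10]
    }
}
```

The list keeps its encounter order even in parallel mode because the source is ordered and `collect()` merges partial results in order.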

Practical Examples¶

Top N Elements¶

// Top 5 highest salaries
List<Employee> top5 = employees.stream()
    .sorted(Comparator.comparingDouble(Employee::getSalary).reversed())
    .limit(5)
    .collect(Collectors.toList());

Flatten and Deduplicate¶

// Get all unique tags from all posts
Set<String> allTags = posts.stream()
    .flatMap(post -> post.getTags().stream())
    .collect(Collectors.toSet());

Calculate Statistics¶

DoubleSummaryStatistics stats = employees.stream()
    .mapToDouble(Employee::getSalary)
    .summaryStatistics();

System.out.println("Count: " + stats.getCount());
System.out.println("Sum: " + stats.getSum());
System.out.println("Min: " + stats.getMin());
System.out.println("Max: " + stats.getMax());
System.out.println("Average: " + stats.getAverage());

Frequency Map¶

Map<String, Long> wordFrequency = words.stream()
    .collect(Collectors.groupingBy(
        Function.identity(),
        Collectors.counting()
    ));

Find Duplicates¶

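// Simple but O(n^2): Collections.frequency() rescans the list for every element
Set<String> duplicates = names.stream()
    .filter(n -> Collections.frequency(names, n) > 1)
    .collect(Collectors.toSet());

// More efficient: Set.add() returns false when the element was already present.
// The lambda is stateful, so use this only on a sequential stream.
Set<String> seen = new HashSet<>();
Set<String> duplicates2 = names.stream()
    .filter(n -> !seen.add(n))
    .collect(Collectors.toSet());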
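The `HashSet`-based filter above depends on a lambda that mutates external state. One alternative that avoids this, sketched here with the hypothetical names `DuplicatesDemo` and `duplicates`, is to count occurrences first and then keep the keys seen more than once.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class for illustration.
class DuplicatesDemo {

    static Set<String> duplicates(List<String> names) {
        // Pass 1: count how often each name occurs.
        Map<String, Long> counts = names.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Pass 2: keep only the names that occur more than once.
        return counts.entrySet().stream()
            .filter(e -> e.getValue() > 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(duplicates(List.of("a", "b", "a", "c", "b", "a")));
    }
}
```

This takes two passes but keeps every lambda stateless, so it does not rely on sequential execution.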

Map Transformation¶

// Build a Map from a list (name -> length); duplicate names would throw IllegalStateException
Map<String, Integer> nameLengths = names.stream()
    .collect(Collectors.toMap(
        Function.identity(),
        String::length
    ));

// Filter Map entries
Map<String, Integer> filtered = originalMap.entrySet().stream()
    .filter(e -> e.getValue() > 10)
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

// Invert Map (swap keys and values); throws IllegalStateException if two keys share a value
Map<Integer, String> inverted = originalMap.entrySet().stream()
    .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));

Common Interview Questions¶

Stream vs Collection?

- Collection stores elements, Stream processes them
- Streams are lazy, Collections are eager
- Streams can only be consumed once

Intermediate vs Terminal operations?

- Intermediate: Return stream, lazy (filter, map, sorted)
- Terminal: Return result, trigger execution (collect, forEach, reduce)

map() vs flatMap()?

- map(): One-to-one transformation
- flatMap(): One-to-many, flattens nested streams

When to use parallel streams?

- Large datasets, CPU-intensive operations, stateless operations

Why are streams lazy?

- Efficiency: Operations fused, short-circuiting possible
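The last answer above can be made tangible by counting how many elements an infinite source actually produces before a short-circuiting terminal operation is satisfied. This is a sketch with made-up names (`ShortCircuitDemo`, `firstMultipleOf`); the counter inside `peek()` exists only to observe the traversal.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class for illustration.
class ShortCircuitDemo {

    // Returns { first multiple found, number of elements pulled through the pipeline }.
    static int[] firstMultipleOf(int divisor) {
        AtomicInteger examined = new AtomicInteger();
        Optional<Integer> first = Stream.iterate(1, n -> n + 1)  // infinite source: 1, 2, 3, ...
            .peek(n -> examined.incrementAndGet())               // count each element that flows through
            .filter(n -> n % divisor == 0)
            .findFirst();                                        // stops at the first match
        return new int[] { first.get(), examined.get() };
    }

    public static void main(String[] args) {
        int[] result = firstMultipleOf(7);
        System.out.println(result[0] + " found after examining " + result[1] + " elements");
    }
}
```

Each element passes through `peek()` and `filter()` before the next one is generated, so the pipeline terminates on an infinite source as soon as `findFirst()` has its answer.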