
Java Streams

What are Streams?

Streams provide a declarative way to process collections of data. They support functional-style operations and can be parallelized.

// Imperative (traditional)
List<String> filtered = new ArrayList<>();
for (String name : names) {
    if (name.startsWith("A")) {
        filtered.add(name.toUpperCase());
    }
}

// Declarative (streams)
List<String> filtered = names.stream()
    .filter(name -> name.startsWith("A"))
    .map(String::toUpperCase)
    .collect(Collectors.toList());

Stream Pipeline


Key Characteristics:
- Streams don't store data (unlike collections)
- Streams don't modify the source
- Operations are lazy (evaluated only when a terminal operation is invoked)
- Streams can only be consumed once
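
The laziness and single-use rules are easy to check by hand. The sketch below is one possible way to observe them; the class and method names (StreamCharacteristicsDemo, visitedBeforeTerminal, secondUseFails) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamCharacteristicsDemo {

    // Returns the elements the filter had seen BEFORE any terminal operation ran.
    static List<String> visitedBeforeTerminal(List<String> names) {
        List<String> visited = new ArrayList<>();
        Stream<String> pipeline = names.stream()
            .filter(n -> { visited.add(n); return n.startsWith("A"); });
        // Only intermediate operations so far, so nothing has been evaluated
        List<String> snapshot = new ArrayList<>(visited);
        pipeline.collect(Collectors.toList()); // the terminal operation triggers the filter
        return snapshot;
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(List<String> names) {
        Stream<String> stream = names.stream();
        stream.count();
        try {
            stream.count(); // the stream was already consumed above
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Anna");
        System.out.println(visitedBeforeTerminal(names)); // []
        System.out.println(secondUseFails(names));        // true
    }
}
```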


Creating Streams

// From Collection
List<String> list = List.of("a", "b", "c");
Stream<String> stream1 = list.stream();
Stream<String> parallel1 = list.parallelStream();

// From Array
String[] array = {"a", "b", "c"};
Stream<String> stream2 = Arrays.stream(array);
Stream<String> stream3 = Stream.of(array);  // varargs overload also accepts an array

// From values
Stream<String> stream4 = Stream.of("x", "y", "z");

// Empty stream
Stream<String> empty = Stream.empty();

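// Infinite streams (bound them with limit() or takeWhile() before a terminal operation)
Stream<Integer> infinite1 = Stream.iterate(0, n -> n + 1);  // 0, 1, 2, 3, ...
Stream<Double> random = Stream.generate(Math::random);

// Bounded iterate: the three-argument form stops when the predicate fails
Stream<Integer> bounded = Stream.iterate(0, n -> n < 100, n -> n + 1);  // 0..99, Java 9+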

// From range
IntStream range1 = IntStream.range(1, 10);       // 1-9 (exclusive end)
IntStream range2 = IntStream.rangeClosed(1, 10); // 1-10 (inclusive end)

// From String
IntStream chars = "hello".chars();  // IntStream of UTF-16 code units (ints, not chars)

// From file (holds an open file handle: close it, e.g. with try-with-resources)
Stream<String> lines = Files.lines(Path.of("file.txt"));

// Builder
Stream<String> built = Stream.<String>builder()
    .add("a")
    .add("b")
    .add("c")
    .build();
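
Of the sources above, Files.lines is the one that owns an external resource, so it is usually wrapped in try-with-resources. A minimal sketch, assuming a UTF-8 text file; FileLinesDemo, countNonBlankLines, and the temp-file contents are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class FileLinesDemo {

    // Counts non-blank lines; try-with-resources closes the underlying file handle.
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    // Writes a small temp file (made-up content) and counts its non-blank lines.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.write(tmp, List.of("first", "", "second"));
                return countNonBlankLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```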

Intermediate Operations

filter() - Keep elements matching predicate

List<Integer> evens = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(Collectors.toList());

// Multiple conditions
List<Person> adults = people.stream()
    .filter(p -> p.getAge() >= 18)
    .filter(p -> p.getCountry().equals("US"))
    .collect(Collectors.toList());

map() - Transform elements

List<String> names = people.stream()
    .map(Person::getName)
    .collect(Collectors.toList());

List<Integer> lengths = words.stream()
    .map(String::length)
    .collect(Collectors.toList());

// Chained transformations
List<String> upperNames = people.stream()
    .map(Person::getName)
    .map(String::toUpperCase)
    .collect(Collectors.toList());

flatMap() - Flatten nested structures

// List of lists to single list
List<List<Integer>> nested = List.of(
    List.of(1, 2),
    List.of(3, 4),
    List.of(5, 6)
);
List<Integer> flat = nested.stream()
    .flatMap(List::stream)
    .collect(Collectors.toList());  // [1, 2, 3, 4, 5, 6]

// Flatten object's collections
List<String> allPhones = people.stream()
    .flatMap(p -> p.getPhoneNumbers().stream())
    .collect(Collectors.toList());

// Split and flatten
List<String> words = lines.stream()
    .flatMap(line -> Arrays.stream(line.split(" ")))
    .collect(Collectors.toList());

sorted() - Sort elements

// Natural order
List<String> sorted = names.stream()
    .sorted()
    .collect(Collectors.toList());

// Custom comparator
List<Person> byAge = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge))
    .collect(Collectors.toList());

// Multiple criteria
List<Person> sorted = people.stream()
    .sorted(Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getFirstName)
        .thenComparingInt(Person::getAge))
    .collect(Collectors.toList());

// Reverse order
List<Person> byAgeDesc = people.stream()
    .sorted(Comparator.comparingInt(Person::getAge).reversed())
    .collect(Collectors.toList());

distinct() - Remove duplicates

List<Integer> unique = numbers.stream()
    .distinct()  // Uses equals()
    .collect(Collectors.toList());

limit() and skip() - Pagination

// First 5 elements
List<String> first5 = stream.limit(5).collect(Collectors.toList());

// Skip first 10, take next 5 (on a fresh stream; a consumed stream cannot be reused)
List<String> page3 = stream.skip(10).limit(5).collect(Collectors.toList());

// Infinite stream with limit
List<Integer> first10 = Stream.iterate(1, n -> n + 1)
    .limit(10)
    .collect(Collectors.toList());  // [1, 2, 3, ..., 10]

peek() - Debug/side effects

List<String> result = names.stream()
    .filter(n -> n.length() > 3)
    .peek(n -> System.out.println("Filtered: " + n))
    .map(String::toUpperCase)
    .peek(n -> System.out.println("Mapped: " + n))
    .collect(Collectors.toList());
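
peek() also makes the evaluation order visible: a sequential pipeline pushes one element at a time through all stages, and a short-circuiting operation such as limit() stops pulling from the source once it is satisfied. A small sketch of that behavior; PeekOrderDemo and the sample names are illustrative only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PeekOrderDemo {

    // Records the order in which the pipeline stages see each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of("bob", "anna", "carl")
            .peek(n -> log.add("seen:" + n))
            .filter(n -> n.length() > 3)
            .peek(n -> log.add("kept:" + n))
            .limit(1)                       // short-circuits after the first match
            .collect(Collectors.toList());
        return log;
    }

    public static void main(String[] args) {
        // "carl" is never pulled from the source
        System.out.println(trace()); // [seen:bob, seen:anna, kept:anna]
    }
}
```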

takeWhile() / dropWhile() (Java 9+)

// Take while condition is true
List<Integer> result1 = Stream.of(1, 2, 3, 4, 5, 1, 2)
    .takeWhile(n -> n < 4)
    .collect(Collectors.toList());  // [1, 2, 3]

// Drop while condition is true
List<Integer> result2 = Stream.of(1, 2, 3, 4, 5, 1, 2)
    .dropWhile(n -> n < 4)
    .collect(Collectors.toList());  // [4, 5, 1, 2]

Terminal Operations

collect() - Gather results

// To List (each statement below needs its own stream instance)
List<String> list = stream.collect(Collectors.toList());
List<String> list2 = stream.toList();  // Java 16+ (unmodifiable)

// To Set
Set<String> set = stream.collect(Collectors.toSet());

// To specific collection
TreeSet<String> treeSet = stream.collect(Collectors.toCollection(TreeSet::new));

// To Map (throws IllegalStateException on duplicate keys)
Map<Integer, String> map = people.stream()
    .collect(Collectors.toMap(Person::getId, Person::getName));

// To Map with merge function (handle duplicates)
Map<String, Integer> wordCount = words.stream()
    .collect(Collectors.toMap(
        w -> w,
        w -> 1,
        Integer::sum  // Merge function for duplicates
    ));

// To String
String joined = names.stream()
    .collect(Collectors.joining(", "));  // "a, b, c"

String csv = names.stream()
    .collect(Collectors.joining(",", "[", "]"));  // "[a,b,c]"

Grouping and Partitioning

// Group by single key
Map<String, List<Person>> byCity = people.stream()
    .collect(Collectors.groupingBy(Person::getCity));

// Group by with downstream collector
Map<String, Long> countByCity = people.stream()
    .collect(Collectors.groupingBy(Person::getCity, Collectors.counting()));

Map<String, Double> avgAgeByCity = people.stream()
    .collect(Collectors.groupingBy(
        Person::getCity,
        Collectors.averagingInt(Person::getAge)
    ));

// Nested grouping
Map<String, Map<String, List<Person>>> byCityAndDept = people.stream()
    .collect(Collectors.groupingBy(
        Person::getCity,
        Collectors.groupingBy(Person::getDepartment)
    ));

// Partitioning (split into two groups)
Map<Boolean, List<Person>> adultsAndMinors = people.stream()
    .collect(Collectors.partitioningBy(p -> p.getAge() >= 18));

List<Person> adults = adultsAndMinors.get(true);
List<Person> minors = adultsAndMinors.get(false);
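
The grouping snippets assume a Person type that the page never defines. A self-contained sketch with a hypothetical minimal Person (name, city, age) and made-up sample data could look like this:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {

    // Hypothetical minimal Person type, for illustration only.
    static final class Person {
        private final String name;
        private final String city;
        private final int age;

        Person(String name, String city, int age) {
            this.name = name;
            this.city = city;
            this.age = age;
        }

        String getName() { return name; }
        String getCity() { return city; }
        int getAge() { return age; }
    }

    // Groups by city, then maps each group down to the names only.
    static Map<String, List<String>> namesByCity(List<Person> people) {
        return people.stream()
            .collect(Collectors.groupingBy(
                Person::getCity,
                Collectors.mapping(Person::getName, Collectors.toList())));
    }

    // Partitions into adults (true) and minors (false) and counts each side.
    static Map<Boolean, Long> countByAdulthood(List<Person> people) {
        return people.stream()
            .collect(Collectors.partitioningBy(p -> p.getAge() >= 18, Collectors.counting()));
    }

    static List<Person> sample() {
        return List.of(
            new Person("Ann", "Oslo", 30),
            new Person("Ben", "Rome", 15),
            new Person("Cy", "Oslo", 41));
    }

    public static void main(String[] args) {
        System.out.println(namesByCity(sample()).get("Oslo"));  // [Ann, Cy]
        System.out.println(countByAdulthood(sample()).get(true)); // 2
    }
}
```

Unlike groupingBy, partitioningBy always produces both the true and the false key, even when one side is empty.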

reduce() - Aggregate to single value

// Sum
int sum = numbers.stream().reduce(0, Integer::sum);
Optional<Integer> sum2 = numbers.stream().reduce(Integer::sum);

// Product
int product = numbers.stream().reduce(1, (a, b) -> a * b);

// Max/Min
Optional<Integer> max = numbers.stream().reduce(Integer::max);
Optional<Integer> min = numbers.stream().reduce(Integer::min);

// Concatenate strings
String concat = strings.stream().reduce("", String::concat);

// Custom reduction
Optional<Person> oldest = people.stream()
    .reduce((p1, p2) -> p1.getAge() > p2.getAge() ? p1 : p2);
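
The difference between the reduce() overloads shows up most clearly on an empty stream, and the three-argument form allows the result type to differ from the element type. A sketch under those assumptions; ReduceDemo and its method names are hypothetical:

```java
import java.util.List;
import java.util.Optional;

class ReduceDemo {

    // With an identity value, an empty stream simply yields the identity.
    static int sumWithIdentity(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Without an identity, the result is an Optional that is empty for an empty stream.
    static Optional<Integer> sumOptional(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    // Three-argument form: accumulate Strings into an Integer total;
    // the combiner (Integer::sum) merges partial results in parallel runs.
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (len, w) -> len + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumWithIdentity(List.of()));       // 0
        System.out.println(sumOptional(List.of()));           // Optional.empty
        System.out.println(totalLength(List.of("ab", "cde"))); // 5
    }
}
```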

forEach() - Side effects

names.stream().forEach(System.out::println);
names.forEach(System.out::println);  // Iterable.forEach: no stream needed

// forEachOrdered (maintains order in parallel)
names.parallelStream().forEachOrdered(System.out::println);

count(), min(), max(), sum(), average()

long count = stream.count();

Optional<Integer> min = numbers.stream().min(Comparator.naturalOrder());
Optional<Integer> max = numbers.stream().max(Comparator.naturalOrder());

// For primitive streams
int sum = IntStream.of(1, 2, 3, 4, 5).sum();
double avg = IntStream.of(1, 2, 3, 4, 5).average().orElse(0);
IntSummaryStatistics stats = IntStream.of(1, 2, 3, 4, 5).summaryStatistics();
// stats.getSum(), stats.getAverage(), stats.getMin(), stats.getMax(), stats.getCount()

findFirst(), findAny()

Optional<String> first = names.stream()
    .filter(n -> n.startsWith("A"))
    .findFirst();

// findAny may be faster on parallel streams because it need not respect encounter order
Optional<String> any = names.parallelStream()
    .filter(n -> n.startsWith("A"))
    .findAny();

anyMatch(), allMatch(), noneMatch()

boolean hasAdult = people.stream().anyMatch(p -> p.getAge() >= 18);
boolean allAdults = people.stream().allMatch(p -> p.getAge() >= 18);
boolean noMinors = people.stream().noneMatch(p -> p.getAge() < 18);

toArray()

String[] array = names.stream().toArray(String[]::new);
Object[] objects = names.stream().toArray();

Primitive Streams

// IntStream, LongStream, DoubleStream - avoid boxing overhead

// Create
IntStream intStream = IntStream.of(1, 2, 3);
IntStream range = IntStream.range(1, 100);
IntStream fromArray = Arrays.stream(new int[]{1, 2, 3});

// Convert from object stream
IntStream ages = people.stream().mapToInt(Person::getAge);
LongStream ids = people.stream().mapToLong(Person::getId);
DoubleStream salaries = people.stream().mapToDouble(Person::getSalary);

// Convert to object stream
Stream<Integer> boxed = intStream.boxed();

// Specialized operations
int sum = IntStream.of(1, 2, 3, 4, 5).sum();
OptionalDouble avg = IntStream.of(1, 2, 3, 4, 5).average();
OptionalInt max = IntStream.of(1, 2, 3, 4, 5).max();
IntSummaryStatistics stats = IntStream.of(1, 2, 3, 4, 5).summaryStatistics();
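
Put together, a primitive pipeline stays on int values until boxing is actually needed. The following sketch is illustrative only; PrimitiveStreamDemo and its methods are made-up names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveStreamDemo {

    // Stays on primitives the whole way: no Integer objects are created.
    static int sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    // average() returns OptionalDouble, which is empty for an empty array.
    static double averageOrZero(int[] values) {
        return Arrays.stream(values).average().orElse(0.0);
    }

    // boxed() converts back to Stream<Integer> when a collection is required.
    static List<Integer> boxedEvens(int n) {
        return IntStream.rangeClosed(1, n)
            .filter(i -> i % 2 == 0)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3));            // 14
        System.out.println(averageOrZero(new int[0]));  // 0.0
        System.out.println(boxedEvens(6));              // [2, 4, 6]
    }
}
```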

Parallel Streams

// Create parallel stream
Stream<String> parallel = list.parallelStream();
Stream<String> parallel2 = stream.parallel();

// Convert back to sequential
Stream<String> sequential = parallel.sequential();

// Check if parallel
boolean isParallel = stream.isParallel();

// Example
long count = list.parallelStream()
    .filter(s -> s.length() > 3)
    .count();

When to Use Parallel Streams

Parallel streams tend to pay off for large datasets, CPU-intensive operations, and stateless operations.
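
As a rough illustration of parallel-friendly code: the operations below are stateless and the results are combined by the stream itself (sum(), collect()) rather than through shared mutable state. ParallelDemo and its method names are hypothetical, and whether parallelism is actually faster depends on data size and workload.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

class ParallelDemo {

    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same pipeline, split across worker threads; the result is identical.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    // Safe in parallel: collect() merges per-thread containers
    // instead of having several threads add to one shared list.
    static List<Integer> parallelSquares(List<Integer> input) {
        return input.parallelStream()
            .map(i -> i * i)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sequentialSum(1000));               // 500500
        System.out.println(parallelSum(1000));                 // 500500
        System.out.println(parallelSquares(List.of(1, 2, 3))); // [1, 4, 9]
    }
}
```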


Practical Examples

Top N Elements

// Top 5 highest salaries
List<Employee> top5 = employees.stream()
    .sorted(Comparator.comparingDouble(Employee::getSalary).reversed())
    .limit(5)
    .collect(Collectors.toList());

Flatten and Deduplicate

// Get all unique tags from all posts
Set<String> allTags = posts.stream()
    .flatMap(post -> post.getTags().stream())
    .collect(Collectors.toSet());

Calculate Statistics

DoubleSummaryStatistics stats = employees.stream()
    .mapToDouble(Employee::getSalary)
    .summaryStatistics();

System.out.println("Count: " + stats.getCount());
System.out.println("Sum: " + stats.getSum());
System.out.println("Min: " + stats.getMin());
System.out.println("Max: " + stats.getMax());
System.out.println("Average: " + stats.getAverage());

Frequency Map

Map<String, Long> wordFrequency = words.stream()
    .collect(Collectors.groupingBy(
        Function.identity(),
        Collectors.counting()
    ));

Find Duplicates

Set<String> duplicates = names.stream()
    .filter(n -> Collections.frequency(names, n) > 1)
    .collect(Collectors.toSet());

// More efficient (single pass), but relies on a stateful lambda: sequential streams only
Set<String> seen = new HashSet<>();
Set<String> duplicates2 = names.stream()
    .filter(n -> !seen.add(n))
    .collect(Collectors.toSet());
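
A stateless alternative, which should also behave correctly on a parallel stream, is to count occurrences first and keep the keys seen more than once. A sketch, with DuplicatesDemo as an illustrative name:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicatesDemo {

    // Counts each name, then keeps the ones that occur more than once.
    static Set<String> duplicates(List<String> names) {
        return names.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet().stream()
            .filter(e -> e.getValue() > 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Contains "a" and "b"; the iteration order of the set is unspecified
        System.out.println(duplicates(List.of("a", "b", "a", "c", "b")));
    }
}
```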

Map Transformation

// Build a Map from a list: name -> length (throws IllegalStateException on duplicate names)
Map<String, Integer> nameLengths = names.stream()
    .collect(Collectors.toMap(
        Function.identity(),
        String::length
    ));

// Filter Map entries
Map<String, Integer> filtered = originalMap.entrySet().stream()
    .filter(e -> e.getValue() > 10)
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

// Invert Map (swap keys and values); throws IllegalStateException if two keys share a value
Map<Integer, String> inverted = originalMap.entrySet().stream()
    .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
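
When values are not unique, a plain toMap inversion fails on the first repeated value. One possible workaround, sketched here with made-up names (InvertMapDemo, invertGrouping), is to group the original keys under each value:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class InvertMapDemo {

    // Returns true if the plain inversion is rejected because of a duplicate value.
    static boolean plainInvertFails(Map<String, Integer> source) {
        try {
            Map<Integer, String> inverted = source.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Keeps every original key: value -> list of keys (sorted for a stable result).
    static Map<Integer, List<String>> invertGrouping(Map<String, Integer> source) {
        return source.entrySet().stream()
            .sorted(Map.Entry.comparingByKey())
            .collect(Collectors.groupingBy(
                Map.Entry::getValue,
                Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<String, Integer> source = Map.of("a", 1, "b", 2, "c", 1);
        System.out.println(plainInvertFails(source));      // true
        System.out.println(invertGrouping(source).get(1)); // [a, c]
    }
}
```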

Common Interview Questions

  1. Stream vs Collection?
     - Collection stores elements, Stream processes them
     - Streams are lazy, Collections are eager
     - Streams can only be consumed once

  2. Intermediate vs Terminal operations?
     - Intermediate: Return stream, lazy (filter, map, sorted)
     - Terminal: Return result, trigger execution (collect, forEach, reduce)

  3. map() vs flatMap()?
     - map(): One-to-one transformation
     - flatMap(): One-to-many, flattens nested streams

  4. When to use parallel streams?
     - Large datasets, CPU-intensive operations, stateless operations

  5. Why are streams lazy?
     - Efficiency: Operations fused, short-circuiting possible
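
For the map() vs flatMap() question, a side-by-side sketch on the same input can make the one-to-one versus one-to-many distinction concrete. MapVsFlatMapDemo and the sample lines are illustrative only:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapVsFlatMapDemo {

    // map(): exactly one output element per input line.
    static List<Integer> lineLengths(List<String> lines) {
        return lines.stream()
            .map(String::length)
            .collect(Collectors.toList());
    }

    // flatMap(): each line expands to several words, flattened into one stream.
    static List<String> words(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a b", "c");
        System.out.println(lineLengths(lines)); // [3, 1]
        System.out.println(words(lines));       // [a, b, c]
    }
}
```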