How to find duplicate characters in a string in Java 8
The Java code below identifies duplicate characters within a given string using the Stream API and a HashSet. Here's an explanation of the code and its functionality:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class DuplicateCharactersInAString {
    public static void main(String[] args) {
        String input = "java is programing language";
        Set<String> set = new HashSet<>(); // Stores unique characters encountered so far
        Set<String> duplicateCharacters = Arrays.stream(input.split("")) // Split the string into single-character strings
                .filter(ch -> !set.add(ch)) // Keep characters that cannot be added to 'set' (i.e., duplicates)
                .collect(Collectors.toSet()); // Collect the filtered characters into a Set
        System.out.println(duplicateCharacters);
    }
}
Explanation
- Imports:
  - java.util.Arrays: Provides utility methods for manipulating arrays, including Arrays.stream(), which converts an array into a stream.
  - java.util.HashSet: A class implementing the Set interface; it stores unique elements and does not allow duplicates.
  - java.util.Set: An interface representing a collection that cannot contain duplicate elements.
  - java.util.stream.Collectors: Provides static methods that implement reduction operations on streams, including Collectors.toSet(), which collects elements into a Set.
- input.split(""):
  - The split("") method divides the input string into an array of individual characters, including spaces. Each element of the array is a String of length 1 representing a single character.
- Arrays.stream(...):
  - This converts the array of single-character strings into a stream, enabling the use of Stream API operations.
- .filter(ch -> !set.add(ch)):
  - This is the core of the duplicate detection logic.
  - set.add(ch): Attempts to add the current character (ch) to the set.
    - If ch is not already present in set, it is added successfully, and set.add(ch) returns true.
    - If ch is already present in set (meaning it is a duplicate), it is not added again, and set.add(ch) returns false.
  - !set.add(ch): The negation operator (!) reverses the boolean value.
    - If set.add(ch) is true (first occurrence of the character), !true is false, so the character is filtered out.
    - If set.add(ch) is false (repeated occurrence), !false is true, so the character is included in the filtered stream.
  - Therefore, the filter operation keeps only the repeated occurrences, that is, the duplicate characters, in the stream.
- .collect(Collectors.toSet()):
  - This gathers the filtered stream of duplicate characters into a Set. Using Collectors.toSet() ensures that even if a character appears more than twice, it is stored only once in the duplicateCharacters set, because sets store only unique elements.
- System.out.println(duplicateCharacters);:
  - This prints to the console the Set containing each duplicated character found in the input string, listed once.
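The return value of set.add(ch), which the filter step relies on, can be observed directly. The following is a minimal sketch rather than part of the original program; the class name AddReturnValueDemo and the method addResults are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of the original program.
class AddReturnValueDemo {

    // Records what seen.add(ch) returns for each character of the input, in order.
    static List<Boolean> addResults(String input) {
        Set<String> seen = new HashSet<>();
        List<Boolean> results = new ArrayList<>();
        for (String ch : input.split("")) {
            results.add(seen.add(ch)); // true on first sight, false on a repeat
        }
        return results;
    }

    public static void main(String[] args) {
        // "java" splits into j, a, v, a; only the second 'a' is rejected by the set.
        System.out.println(addResults("java")); // prints [true, true, true, false]
    }
}
```

Each false in the result marks a position that the filter would keep, which is why only repeated occurrences reach the collect step.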
Output
Given the input string "java is programing language", the code prints a set with six elements: a, g, i, n, r, and the space character. Collectors.toSet() makes no guarantee about ordering, so the printed order may vary; for example, a run may print [ , a, r, g, i, n].

Note: The output includes the space character because split("") treats a space like any other character, and the input contains three spaces. The letter 'e' is not included, since it appears only once in the input. The duplicate characters in a string are those that appear more than once.
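One way to check which characters really appear more than once is to count every character and keep those with a count of at least two. The sketch below uses a different technique from the program above (Collectors.groupingBy with Collectors.counting instead of a HashSet inside filter); the class name DuplicateCharacterCounts and the method duplicatesWithCounts are hypothetical, and a TreeMap is assumed only to make the printed order predictable.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical cross-check, not the original program.
class DuplicateCharacterCounts {

    // Counts every character, then drops those that occur only once.
    static Map<String, Long> duplicatesWithCounts(String input) {
        Map<String, Long> counts = Arrays.stream(input.split(""))
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
        counts.values().removeIf(n -> n < 2); // keep only characters seen at least twice
        return counts;
    }

    public static void main(String[] args) {
        // The key printed before "=3" is the space character.
        System.out.println(duplicatesWithCounts("java is programing language"));
        // prints { =3, a=5, g=4, i=2, n=2, r=2}
    }
}
```

Unlike the filter-based version, this variant avoids mutating an external set from inside a lambda and also reports how often each duplicate occurs, at the cost of building a count for every character.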