String Tokenizer and Splitting a string into fixed length parts


String Tokenizer


The java.util.StringTokenizer class breaks a string into tokens. It is a simple way to split a string. The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis.
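
As a small illustration of the per-token case, the sketch below changes the delimiter set part-way through by calling nextToken(String). The class name PerTokenDelimiters, the parse helper, and the sample input are made up for this example. One thing to watch: when the set changes, the tokenizer is still positioned on the old delimiter, so the new set here contains '=' as well as ';'.

```java
import java.util.StringTokenizer;

public class PerTokenDelimiters
{
    // Hypothetical helper: returns the three tokens of an input shaped like "key=value;rest".
    static String[] parse(String input)
    {
        // Delimiter set chosen at creation time: '=' only.
        StringTokenizer st = new StringTokenizer(input, "=");
        String key = st.nextToken();
        // Delimiter set changed for this token and all later ones: '=' or ';'.
        // '=' stays in the set so the leftover '=' is skipped instead of being returned.
        String value = st.nextToken("=;");
        String rest = st.nextToken();
        return new String[] { key, value, rest };
    }

    public static void main(String args[])
    {
        // Prints key, value, other on separate lines.
        for (String token : parse("key=value;other"))
        {
            System.out.println(token);
        }
    }
}
```

If the second call were nextToken(";") instead, the token would come back as "=value", because '=' would no longer count as a delimiter.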

StringTokenizer Split by space

import java.util.StringTokenizer;
 
public class ModifiedSimple
{
    public static void main(String args[])
    {
        StringTokenizer st = new StringTokenizer("one two three four", " ");
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}

Result:

one
two
three
four

StringTokenizer Split by pipe '|'

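import java.util.StringTokenizer;

// Class name chosen for this example.
public class PipeSplit
{
    public static void main(String args[])
    {
        // Only '|' is a delimiter here, so the comma stays inside its token.
        StringTokenizer st = new StringTokenizer("hello|world,open|close", "|");
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}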
 

Result:

hello
world,open
close
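
The output above keeps "world,open" as one token because only '|' is a delimiter. Every character in the delimiter string counts as a separate delimiter, so passing "|," splits on both. A minimal sketch follows; the class name MultiDelimiterSplit and the tokens helper are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class MultiDelimiterSplit
{
    // Hypothetical helper: collects all tokens of input, treating every
    // character of delimiters as a separator.
    static List<String> tokens(String input, String delimiters)
    {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiters);
        while (st.hasMoreTokens())
        {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String args[])
    {
        // Prints [hello, world, open, close]
        System.out.println(tokens("hello|world,open|close", "|,"));
    }
}
```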

Splitting a string into fixed length parts


The trick is to use a look-behind containing the regex anchor \G, which means "end of previous match":

String[] parts = str.split("(?<=\\G.{8})");

The regex matches the position 8 characters after the end of the previous match. Because the match is zero-width, its start and end coincide, so we could more simply say "8 characters after the last match"; it also means split() removes no characters from the input.

Conveniently, \G is initialized to the start of the input, so it works for the first part of the input too.
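
To see the split in action, here is a small runnable sketch. The class name FixedLengthSplit, the splitEvery8 helper, and the sample input are assumptions made for this example. Note that '.' does not match line terminators by default, so input containing newlines may not split the way you expect.

```java
import java.util.Arrays;

public class FixedLengthSplit
{
    // Hypothetical helper wrapping the split shown above.
    static String[] splitEvery8(String str)
    {
        // Zero-width match at every position that lies exactly 8 characters
        // after the end of the previous match (or after the start of input).
        return str.split("(?<=\\G.{8})");
    }

    public static void main(String args[])
    {
        // 20 characters: two full parts of 8 and a remainder of 4.
        // Prints [abcdefgh, ijklmnop, qrst]
        System.out.println(Arrays.toString(splitEvery8("abcdefghijklmnopqrst")));
    }
}
```

The last part is simply whatever is left over when the input length is not a multiple of 8.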

Break a string up into substrings of a length held in a variable

Same as the fixed-length example, but build the regex with the length inserted:

int length = 5;
String[] parts = str.split("(?<=\\G.{" + length + "})");
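
If you would rather avoid the regex, a plain substring loop gives the same parts for non-empty input. This is a different technique from the split above, shown only as a sketch; the class name ChunkBySubstring and the chunk helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkBySubstring
{
    // Hypothetical helper: cuts str into pieces of at most length characters.
    static List<String> chunk(String str, int length)
    {
        if (length <= 0)
        {
            throw new IllegalArgumentException("length must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < str.length(); start += length)
        {
            // The final piece may be shorter than length.
            int end = Math.min(str.length(), start + length);
            parts.add(str.substring(start, end));
        }
        return parts;
    }

    public static void main(String args[])
    {
        // Prints [abcde, fghij, kl]
        System.out.println(chunk("abcdefghijkl", 5));
    }
}
```

The two approaches differ on an empty string: split returns an array holding one empty string, while this loop returns an empty list.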
