Sunday, August 27, 2017

Java: Enforcing a minimum number of method arguments using varargs

Sometimes you need to write a method that requires at least a certain number of arguments. For example, let's say you have a calculation that requires three or more integer arguments. You could write it like this:

public static int calculate(int... args) {
  if (args.length < 3) {
    throw new IllegalArgumentException("At least 3 arguments expected");
  // do something
  return result;

The problem with this is that the client cannot tell how many arguments are required and, if they are pass in too few arguments, the method will fail at runtime rather than compile time. You also have to explicity validate the number of arguments passed into the method.

A better way is to declare the method to take 3 normal arguments and 1 varargs arguments, like this:

public static int calculate(int a, int b, int c, int... rest) {
  // do something
  return result;

Saturday, July 29, 2017

Java 8: Multimaps

A multimap is a Map which maps a single key to multiple values e.g. HashMap<String, List<String>>.

Java 8 introduces a Map.computeIfAbsent method, which makes inserting values into a multimap much simpler:

Map<String, List<String>> multimap = new HashMap<>();
multimap.computeIfAbsent(key, k -> new ArrayList<>()).add(value);

Prior to Java 8, a multimap was usually created as follows:

// create the map
Map<String, List<String>> multimap = new HashMap<>();

// put a key/value into the map
List<String> list = multimap.get(key);
if (list == null) {
  multimap.put(key, list = new ArrayList<>());

Or with Guava's Multimap class:

ListMultimap<String, String> multimap = ArrayListMultimap.create();
multimap.put(key, value);

You can read more about Java 8 updates to the Map class in my previous blog post.

Saturday, July 22, 2017

Java: Splitting a Pipe-delimited String - Fast!

This post shows how you can efficiently split a pipe-delimited string e.g. "foo|bar|baz". There are many ways to do this - I could even write my own - but I will only use those that are available in the JDK (or commonly used libraries) and will measure the performance of each.

Remember that, since the pipe symbol (|) is a special character in regular expressions, it needs to be escaped if necessary.

1. String.split

The most obvious way to split a string on the pipe character is to use Java's String.split:

public static String[] split(String s) {
  return s.split("\\|");

2. String.split with Pattern.quote

Instead of escaping the pipe ourselves, we can use Pattern.quote to do it for us. (Note: Pattern.quote("|") returns "\Q|\E".)

public static String[] splitWithPatternQuote(String s) {
  return s.split(Pattern.quote("|"));

3. Pattern.split

Create a static Pattern and use it to split the string.

private static final Pattern SPLITTER = Pattern.compile("\\|");

public static String[] splitWithPattern(String s) {
  return SPLITTER.split(s);

4. StringUtils.split

Apache Commons provides StringUtils.split, which splits a string on a single character:

import org.apache.commons.lang3.StringUtils;

public static String[] splitWithStringUtils(String s) {
  return StringUtils.split(s, '|');

So, which one is fastest?

I ran each method on 1 million pipe-delimited strings of different lengths - RandomStringUtils.randomAlphabetic is great for generating random strings - and the table below shows how long each one took:

MethodTime (ms)

An interesting observation is that splitWithPatternQuote is so much slower than split, even though they both call String.split internally! If we delve into the source code for String.split, we can see that there is an optimisation (a "fastpath") if the provided regex has two-chars and the first char is a backslash. This applies to "\\|" but, since Pattern.quote produces \Q|\E, it does not use the fastpath and instead creates a new Pattern object for every split. This also explains why it is slower than splitWithPattern, which re-uses the same Pattern object.

Monday, May 29, 2017

Stack Overflow - 150,000 rep reached!

I've been a bit quiet on Stack Overflow lately, but just managed to reach 150,000 reputation!

For me, Stack Overflow has not simply been a quest for reputation, but more about learning and helping fellow programmers in need.

Sunday, May 21, 2017

Shell Scripting: <, << and <<<

This post shows the difference between the three standard input redirection operators: <, << and <<<.

1. <

< allows you to pass data contained in a file to the standard input of a command. The format is:

command < file


# replace comma with new-lines:
tr ',' '\n' < file

# read a file line-by-line and do something with each line:
while IFS= read -r line; do
  # do something with each line
  ssh "$line" who -b
done < hosts.txt

2. <<

<< allows you to pass multi-line text (called a here-document) to a command until a specific "token" word is reached. The format is:

command << TOKEN
line 1
line 2

For example:

# query a database
sqsh -S server -D db << END
select * from table
where foo='bar'

3. <<<

<<< allows you to pass a string of text (called a here-string) to a command. If you find yourself echo-ing text and piping it to a command, you should use a here-string instead. (See my previous post on Useless Use of Echo.) The format is:

command <<< text


tr ',' '\n' <<< "foo,bar,baz"

grep "foo" <<< "foo,bar,baz"

mailx -s subject $USER <<< "email body"