After the release of JDK 8, a lot of developers wondered what reason there still is to use Scala, and especially why they should learn Scala when they already know Java. I am sometimes even asked this question in job interviews.
For me the answer is simple. I know Scala really well, with more than 4 years of commercial experience and a few open source projects of my own written in it. I know Java really, really well, with more than 11 years on the job. If you ask me which one I would choose, it is definitely Scala. Here are some of the reasons why.
(all examples can be found at
https://github.com/kostaskougios/scalavsjdk8 )
- With Scala I can build a new language feature without having to wait for a new language release. For example, try-with-resources was introduced in JDK 7, but in Scala it can be implemented as a library.
So here is try-with-resources using Java 7:
int data;
try (FileInputStream in = new FileInputStream("/tmp/x")) {
// do something with the stream
data = in.read();
} catch (IOException e) {
throw new RuntimeException(e);
}
// .. do something with data
System.out.println(data);
Now we can easily build a Scala library to implement this JDK 7 feature:
import java.io.{Closeable, FileInputStream}
object TryWithResources
{
def tryWithResources[T <: Closeable, R](closeable: T)(f: T => R): R = try {
f(closeable)
} finally {
closeable.close()
}
}
import TryWithResources._
val data = tryWithResources(new FileInputStream("/tmp/x"))(_.read)
println(data)
As you may have noticed, using our Scala try-with-resources is a lot simpler than the JDK 7 one! It also returns a result, avoiding the awkward handling of the "data" variable in the Java code.
- Type inference for variables, method return types, generics, when working with functions etc.
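To give a feel for how far the inference goes, here is a small sketch. The object and the names scores, total and averages are made up for illustration; nothing here needs a type annotation except the method parameter.

```scala
object TypeInferenceExample {
  // The compiler infers Map[String, List[Int]]; no annotation is needed.
  val scores = Map("alice" -> List(90, 85), "bob" -> List(70))

  // The return type (Int) is inferred from the method body.
  def total(name: String) = scores.getOrElse(name, Nil).sum

  // The types of name and marks are inferred from the map's element type,
  // and the result is inferred as Map[String, Double].
  val averages = scores.map { case (name, marks) =>
    name -> marks.sum.toDouble / marks.size
  }
}
```

The code stays statically typed: passing an Int to total, for instance, is still a compile error.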
- Almost everything is an expression and returns a value. Notice in TryWithResources above that the try block returns the value of f(closeable). Even if blocks return values:
val x = if ( ... ) 1 else 2
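The same holds for try/catch and match. A minimal sketch, with hypothetical helper names (sign, parseOrZero, describe) chosen just for this example:

```scala
object ExpressionsExample {
  // if/else is an expression, so no mutable temporary variable is needed.
  def sign(n: Int): String =
    if (n < 0) "negative" else if (n == 0) "zero" else "positive"

  // try/catch yields a value as well.
  def parseOrZero(s: String): Int =
    try s.trim.toInt
    catch { case _: NumberFormatException => 0 }

  // match is an expression too.
  def describe(xs: List[Int]): String = xs match {
    case Nil      => "empty"
    case x :: Nil => s"one element: $x"
    case _        => s"${xs.size} elements"
  }
}
```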
- Along with implicits, you can build some nice APIs. For example, in Scala we can do List(1,2).sum or List(1.5,2.5).sum but we can't do List("a","b").sum . So numeric lists can be summed whereas lists of strings can't. In Java we would have to call mapToInt() or mapToDouble() before summing.
Java sum of numbers:
List<Integer> numbers = Arrays.asList(1, 2, 3);
// summing via reduce: the intention of the code is not obvious
Integer sum = numbers.stream().reduce(0, (a, b) -> a + b);
// or this strange bit of code
int sum1 = numbers.stream().mapToInt(x -> x).sum();
// we can rarely use the following code, as most
// of the time our data will be part of a collection.
int sum2 = Stream.of(1, 2, 3).mapToInt(x -> x).sum();
Scala sum of numbers:
// Scala summing of numbers is much better.
// It is achieved via an implicit num: Numeric[B], which
// is passed on automatically by the compiler.
val sum = List(1, 2, 3).sum
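We can use the same trick in our own APIs. The sketch below defines a hypothetical average helper (not part of the standard library) that only compiles for element types that have a Numeric instance:

```scala
object NumericExample {
  // Hypothetical helper: the compiler supplies num automatically when a
  // Numeric[A] exists for the element type, and rejects the call otherwise.
  def average[A](xs: List[A])(implicit num: Numeric[A]): Double =
    if (xs.isEmpty) 0.0 else num.toDouble(xs.sum) / xs.size

  val intAvg    = average(List(1, 2, 3))
  val doubleAvg = average(List(1.5, 2.5))
  // average(List("a", "b"))  // does not compile: there is no Numeric[String]
}
```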
The above is also reflected in libraries and APIs. For example, the Spark API for Scala is richer than the Java one when it comes to tuples and numbers. The Java RDDs need special methods, e.g. mapToPair, before functions that require pairs can be used.
- Scala was built with immutability in mind. Scala's default collections are immutable, and we always transform a collection into another one without modifying the original. This is true even for Maps, where we can add a key->value mapping without affecting the original map; instead we get a new map instance. This means that we can construct a Map of 1 or 10000 items and use it without making defensive copies or having to wrap it as unmodifiable. This really helps further down the line when building immutable domain models, because we don't have to follow the advice of "Effective Java" on defensive copying during object construction or when returning internal state. These collections are thread safe too.
class MyClass(m:Map[String,String],l:List[String])
The above class is immutable and doesn't need to make defensive copies of m or l, because Map and List here are Scala's default immutable collections and the code constructing it can't modify them.
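Here is a small sketch of what "adding" to an immutable map looks like. Inventory is a made-up domain class for illustration only:

```scala
object ImmutableMapExample {
  val original = Map("a" -> 1, "b" -> 2)

  // "Adding" a mapping returns a new map; original is left untouched.
  val extended = original + ("c" -> 3)

  // "Removing" a key works the same way.
  val reduced = original - "a"

  // Hypothetical immutable domain class: it stores the map as-is,
  // and every update returns a new Inventory.
  final class Inventory(val stock: Map[String, Int]) {
    def add(item: String, qty: Int): Inventory =
      new Inventory(stock + (item -> (stock.getOrElse(item, 0) + qty)))
  }
}
```

Both the old and the new instances can be shared freely between threads, since neither can change after construction.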
- There are immutable collections in Java too, e.g. Guava's ImmutableX classes (ImmutableList, ImmutableMap and so on). But the rest of Java's ecosystem uses the mutable ones. And even the immutable ones still have methods like remove(), which throw exceptions. If you pass them to another library, you can't be sure whether it will work or fail with an exception. Also, most of the time defensive copies have to be created, e.g.
import java.util.Map;
import com.google.common.collect.ImmutableMap;
public class MyClass {
private final Map<String,String> m;
public MyClass(Map<String,String> m){
// defensive copy: later changes to the caller's map can't affect this instance
this.m=ImmutableMap.copyOf(m);
}
}
- Streams: in Java, to do a mapping you need l.stream().map(x->x*2).collect(Collectors.toList()) . That's a lot of boilerplate, and the intent of the code gets lost in it. In Scala it is just l.map(_*2) .
- Java streams are not as flexible as Scala collections (anyone who has worked with both knows this). I often struggle to do a collection manipulation with streams, whereas in Scala it would be 1-2 lines of code.
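A few examples of the kind of manipulations I mean, as a rough sketch (the word list and value names are made up). Each one works directly on the list, with no stream() or collect() step:

```scala
object CollectionsExample {
  val words = List("spark", "scala", "java", "akka", "sbt")

  // Group by first letter, then count the words in each group.
  val countsByInitial: Map[Char, Int] =
    words.groupBy(_.head).map { case (c, ws) => c -> ws.size }

  // Split into two lists with a single predicate.
  val (shortWords, longWords) = words.partition(_.length <= 4)

  // Pair each element with its index.
  val indexed = words.zipWithIndex.map { case (w, i) => s"$i:$w" }
}
```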
- Scala was built with functions in mind. In Java, functions were added later on (much, much later on). And though Java's implementation of functions is quite good, all things considered including backwards compatibility, it is still behind Scala's support for functions. For example, Scala supports typed functions with up to 22 typed parameters (Function0 to Function22).
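A short sketch of functions as ordinary typed values. All names here (timesTwo, increment, twice and so on) are invented for the example; andThen, compose and curried are standard methods on Scala's function types:

```scala
object FunctionsExample {
  // Functions are plain values with types such as Int => Int.
  val timesTwo: Int => Int = _ * 2
  val increment: Int => Int = _ + 1

  // Composition is built in.
  val doubleThenIncrement: Int => Int = timesTwo andThen increment
  val incrementThenDouble: Int => Int = timesTwo compose increment

  // A two-argument function, then its curried form and a partial application.
  val add: (Int, Int) => Int = _ + _
  val addFive: Int => Int = add.curried(5)

  // A generic higher-order function: takes a function, returns a function.
  def twice[A](f: A => A): A => A = f andThen f
}
```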
- Legacy code: Java's ecosystem is littered with it, and it is really hard to avoid mutable JavaBeans (try it with Hibernate, XML processing, JSON processing and so on and you'll hit a wall pretty soon). On the other hand, there is now a Scala library for pretty much anything I need, where I can use thread-safe immutable data structures. Another example: in Java code we have several classes for date and time manipulation: java.util.Date, Calendar, java.sql.Date, Joda's DateTime (my preferred one) and JDK 8's LocalDateTime. Legacy classes like Date and Calendar will be with us for many more years. There is legacy in how objects are constructed too, e.g. new LinkedList() instead of something like LinkedList.empty(). And even now a lot of things that would simplify collection creation are missing from the JDK, e.g. a concise way to create an immutable list of 3 items or a map of 2 key/value pairs.
- Due to missing language features, a lot of Java libraries are based on reflection and the JavaBean notation. For example, JSON libraries use reflection to convert objects to JSON and vice versa. In Scala there are libraries that use macros to do the same at compile time, resulting in much faster (and type-safe) code.
- Scala's case classes implement in a single line a lot of the advice from "Effective Java" regarding immutability, equals + hashCode, builders, the prototype pattern, static factory methods (instead of new X) and toString(). It is a wonder why a feature like that is not available in Java (it is not even planned for JDK 9, so if they ever do it, it will be after 2020). On top of that, case classes are used for pattern matching, which again is not available in Java. See my git repo at https://github.com/kostaskougios/domainmodel_bestpractice for an example of best practices for domain models in Java.
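As a quick illustration, here is a sketch with a made-up Person class showing what that single line gives us:

```scala
object CaseClassExample {
  // One line: immutable fields, equals, hashCode, toString, copy,
  // and a companion object with apply (factory) and unapply (extractor).
  case class Person(name: String, age: Int)

  val alice = Person("Alice", 30)    // factory method, no `new`
  val older = alice.copy(age = 31)   // modified copy; alice is unchanged

  // Pattern matching uses the generated extractor.
  def describe(p: Person): String = p match {
    case Person(name, age) if age >= 18 => s"$name is an adult"
    case Person(name, _)                => s"$name is a minor"
  }
}
```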
- Scala is more type safe than Java. And writing classes is cheap in Scala, which leads programmers to express concepts with classes rather than sticking to, say, Strings. For example, who ever wrote a Java class for an IP address? In Scala it is just case class IP(ip:String), with the extra ability to pattern match against its parts with ip match { case IP(part1,part2,part3,part4) => ... } (this needs a small custom extractor that splits the string, since the generated one only exposes the single ip field). And we get type-safe methods that take an IP as an argument (instead of a String).
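One possible way to get the four-part match is a separate extractor object. This is only a sketch: IPParts and isPrivate are hypothetical names, and the extractor just splits on dots without validating the numbers.

```scala
object IpExample {
  case class IP(ip: String)

  // Hypothetical extractor: splits the address into its four dot-separated
  // parts, and does not match if there are not exactly four.
  object IPParts {
    def unapply(ip: IP): Option[(String, String, String, String)] =
      ip.ip.split("\\.") match {
        case Array(a, b, c, d) => Some((a, b, c, d))
        case _                 => None
      }
  }

  // Type-safe: callers must pass an IP, not an arbitrary String.
  def isPrivate(ip: IP): Boolean = ip match {
    case IPParts("10", _, _, _)      => true
    case IPParts("192", "168", _, _) => true
    case _                           => false
  }
}
```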
- Scala's tuples. Yes, there are Java implementations of them, but they are not part of the JDK and are harder to use. Especially within functional code it is much easier to work with Scala's tuples, and big data projects use them a lot.
- Reading examples in books that contain them for both Scala and Java shows how much more complicated Java code can be. For example, I have a book on Spark and one on Akka. I quickly skip over the Java examples and read only the Scala ones, which most of the time are half the size (if not smaller) and easier to understand.
All of these make Scala code easier to write, easier to maintain, less buggy and more scalable.
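A small sketch of tuples in functional code. The visits data and the minMax helper are invented for the example:

```scala
object TuplesExample {
  // A list of (page, hits) pairs; the type List[(String, Int)] is inferred.
  val visits = List(("home", 3), ("about", 1), ("home", 2))

  // Destructure the pairs while aggregating hits per page.
  val totals: Map[String, Int] =
    visits.groupBy(_._1).map { case (page, hits) => page -> hits.map(_._2).sum }

  // Return several values from a method without writing a holder class.
  def minMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)
  val (lo, hi) = minMax(List(4, 9, 2))

  // Split a list of pairs back into two lists.
  val (pages, counts) = visits.unzip
}
```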