Count number of changes (manipulations) needed to make an anagram in immutable/pure functional Scala with foldLeft and a MultiSet by William Narmontas (Scala Algorithm #4)

Problem

Count number of changes needed to make. This problem is similar to:

One string is an anagram of another if they have the same number of each character, not necessarily in the same order - for example abcc is an anagram of accb and vice-versa.

ab is composed of a and b, and exchanging a to b is enough to create an anagram - ie 1 change of letter is enough.

abc is not possible to create anagrams of because it cannot be split.

poeq requires 2 changes, to create an anagram, for example to pqpq, popo, eoeo and so forth.

Solution

View on Scastie (online Scala code playground, in a new window) if you would like to play around with this code, run it, add new test cases, all from your web browser without installing Scala.

/** Return None is not possible - otherwise return the number of changes needed **/
def anagramChanges(input: String): Option[Int] = {
  val isEvenLength = input.length % 2 == 0
  if (!isEvenLength) None
  else {
    input.splitAt(input.length / 2) match {
      case (firstHalf, secondHalf) =>
        val changesNeeded = MultiSet
          .from(firstHalf)
          .diff(MultiSet.from(secondHalf))
          .counts
          .values
          .sum / 2

        Some(changesNeeded)
    }
  }
}

def testCases(f: String => Option[Int]): Unit = {
  def expect(input: String)(toRequireChanges: Option[Int]): Unit = {
    val result = f(input)
    assert(
      result == toRequireChanges,
      s"Expected result ${toRequireChanges} for input ${input}; got $result"
    )
  }
  expect("aaabbb")(toRequireChanges = Some(3))
  expect("ab")(toRequireChanges = Some(1))
  expect("mnop")(toRequireChanges = Some(2))
  expect("xyyx")(toRequireChanges = Some(0))
  expect("xaxbbbxx")(toRequireChanges = Some(1))
  expect("abc")(toRequireChanges = None)
}

testCases(anagramChanges)

final case class MultiSet[V](counts: Map[V, Int]) {
  def include(v: V): MultiSet[V] = includeN(v, 1)

  def includeN(v: V, n: Int): MultiSet[V] =
    copy(counts = counts.updated(v, counts.getOrElse(v, 0) + n))

  def diff(other: MultiSet[V]): MultiSet[V] = {
    (counts.keySet ++ other.counts.keySet).view
      .map { value =>
        value -> (Math
          .abs(counts.getOrElse(value, 0) - other.counts.getOrElse(value, 0)))
      }
      .collect {
        case (value, count) if count > 0 => value -> count
      }
      .foldLeft(MultiSet.empty[V]) {
        case (multiSet, (value, count)) => multiSet.includeN(value, count)
      }
  }
}

object MultiSet {
  def empty[V]: MultiSet[V] = MultiSet(Map.empty)
  def from[V](t: IterableOnce[V]): MultiSet[V] =
    t.iterator.foldLeft(MultiSet.empty[V])(_.include(_))
}

Explanation

Once we split a string into 2 parts, we are left with 2 strings. Number of changes required is effectively the number of characters that are not in common.

If we're counting a "difference" between two "sets", then we need a set differentiation operation. But this set is of a particular type - we have counters, so in fact it is a Map-like type. There is a name for it and it is a MultiSet (I discovered this while working out a solution for this problem).

Once we've figured out this aspect, the problem boils down to find the diff between these two sets, which is a generic solution independent of String.

This gives us the count for each character that requires replacing on both sides - we sum this count and divide by 2 eliminate the duplicates.

Scala aspects

`foldLeft`
This is like a for-loop iterating through each item, and returning a final result.
`collect`
.collect { case x if f(x) => g(x) } is the same as .filter(x => f(x)).map(x => g(x)) but much more concise in the sense that we can use a "Partial function" to produce a result right next to filtering, eliminating unnecessary steps.