Problem:
Given two decimal numbers as strings, return their product as a string. Requirements:
- Do not use built-in big number arithmetic.
- String length ≤ 110
- Inputs have no leading zeros
- Strings contain only digits
Thoughts
Initially, I didn’t notice the prohibition on using built-in functions, so I directly used Python’s str
and int
functions, which ranked in the top 10%… Later, I realized the problem doesn’t allow built-in functions. I’m not sure if this Python feature counts as built-in big number arithmetic. This is a big number multiplication problem, and there are many established algorithms for this.
My initial Python submission, which beat 88% of submissions:
class Solution(object):
def multiply(self, num1, num2):
return str(int(num1)*int(num2))
The most basic algorithm would be to simulate manual multiplication. First, you need to implement string addition and multi-digit multiplication by a single digit. With these, you can calculate multi-digit by multi-digit multiplication. For example, to calculate 12345*67890
, you compute 12345*6 + 12345*7 + 12345*8 + 12345*9 + 12345*0
, then add trailing zeros to each result and sum them. However, this approach would be relatively slow.
Moreover, CPUs can already handle additions that don’t overflow, so we should leverage this. We can improve the above algorithm by calculating 12345*67890
as (12300 + 45)*(67800 + 90)
. This breaks down into 4 multiplication operations. For numbers with trailing zeros, we can remove them and add them back to the result later. If none of these four multiplications overflow, there’s no problem. Otherwise, we can continue decomposing.
The Karatsuba algorithm can further improve this approach.
We notice that in the addition, there’s 12300*90+45*67800
. We can use previously calculated results, namely 12300*67800
and 45*90
, and then calculate (12300+45)*(67800+90) - 12300*67800 - 45*90
to get 12300*90 + 45*67800
. This reduces the number of multiplication operations by one.
Implementing the Karatsuba algorithm would be a bit more complex, so I’ll first submit an O(n²) algorithm (the basic version) to see its performance, and then improve it. (Results show that the manual calculation method is already quite fast.)
First Version
First, we need to write a string addition algorithm. Looking at the input and output data types, they’re strings. So we can add them digit by digit. We can use two integer arrays to store each digit, then add them to create a third array. Some digits in this third array will exceed 10, so we carry over from the lower digits to the higher ones. Finally, we convert this array back to a string.
Although local testing was fine, the submission was very slow, ranking only at the 10th percentile.
Improvement
Theoretically, this algorithm shouldn’t be slow, but in practice it is. The issue might be with unnecessary conversions between integers and strings. In the above algorithm, when calculating multiplication, we converted strings to numbers and then back to strings, which might be causing the extra time. So, we should directly add the results to the final result.
After submission, the runtime was 9ms, beating 50% of submissions.
Further Improvement
Repeatedly converting the same strings might be consuming time. We can cache string conversions - convert once and store the result, then retrieve directly without additional calculations when needed.
The final code, ready for compilation and execution. Runtime: 6ms, beating 76% of submissions. It seems string conversion was indeed the performance bottleneck.
#include <stdio.h>
#include <string>
#include <stdlib.h>
#include <iostream>
#include <vector>
using namespace std;
const int BINARY = 10;
class Solution
{
private:
vector<int> result;
string _num1, _num2;
long num1buff[120];
long num2buff[120];
public:
string multiply(string num1, string num2)
{
result.clear();
result.resize(num1.length() + num2.length() + 1);
memset(num1buff, -1 , sizeof(long)*120);
memset(num2buff, -1 , sizeof(long)*120);
_num1 = num1;
_num2 = num2;
for(auto &c : _num1){
c-='0';
}
for(auto &c : _num2){
c-='0';
}
// This is a recursive process. Let's see when it terminates.
// It terminates when two numbers multiplied don't overflow.
// Assuming int is 30 bits in binary (for multiplication, we need 30 bits),
// the original two numbers must be 15 bits, which is about 32768.
// So two 4-digit numbers multiplied shouldn't overflow.
// Using recursion to calculate the product
addMultiply(0,num1.length(), 0, num2.length());
string ret;
int i = result.size() -1;
for(; i>0; i--)
{
if(result[i] != 0) break;
}
for(; i>=0; i--)
{
ret.push_back(result[i] + '0');
}
return ret;
}
void addMultiply(int a1, int a2, int b1, int b2 )
{;
// Check if direct calculation is possible
if (a1 == a2 || b1 == b2)
return;
if (a2 - a1 < 10 && b2 - b1 < 10)
{
long int_num1 = getLong1(a1, a2);
long int_num2 = getLong2(b1, b2);
long output = int_num1 * int_num2;
int pos = _num1.length() + _num2.length() - a2 - b2;
while (output != 0 || result[pos] >= BINARY)
{
long a = output % BINARY;
result[pos] += a;
result[pos + 1] += result[pos] / BINARY;
result[pos] %= BINARY;
output /= BINARY;
pos++;
}
return;
}
// Otherwise, split the longer number
if(a2 - a1 >= 10){
addMultiply(a1, (a2 + a1)/2, b1, b2);
addMultiply((a2 + a1)/2, a2, b1, b2);
}
else {
addMultiply(a1, a2, (b1+b2)/2, b2);
addMultiply(a1, a2, b1, (b1+b2)/2);
}
}
long getLong1(int a, int b){
long ret = 0;
if(num1buff[a] != -1) return num1buff[a];
for(int i=a; i!=b;i++){
ret *= BINARY;
ret += _num1[i] ;
}
num1buff[a] = ret;
return ret;
}
long getLong2(int a, int b){
long ret = 0;
if(num2buff[a] != -1) return num2buff[a];
for(int i=a; i!=b;i++){
ret *= BINARY;
ret += _num2[i] ;
}
num2buff[a] = ret;
return ret;
}
};
int main(void)
{
Solution s;
for(int i=0;i<10000;i++){
cout << s.multiply("12345678901", "100") << endl;
cout << s.multiply("100", "100") << endl;
}
return 0;
}
Final ranking: beats 70% of submissions