Cracking The Coding Interview/Q 1.5

From Software Engineers Wiki
Jump to: navigation, search

Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. If the "compressed" string would not become smaller than the original string, your method should return the original string.

Answer

This program consists of two passes. At the first pass, it doesn't create a compressed string, but returns the whether it can be compressed, and the length of new string if possible. If it can be compressed, the second pass actually generates the compressed string.

int count_digits(int num)
{
        int digits;

        for (digits = 1; num >= 10; ++digits)
                num /= 10;

        return digits;
}

/* 
 * cpSrc: original string
 * cpDst: destinatin string. If null pointer, no compression happends.
 *
 * return value: If cpDst is not a null pointer, return 0.
 * If new string is shorter than the original, returns the
 * length of new string. Otherwise, return negative number.
 */

int compress_repeat_prim(const char *cpSrc, char *cpDst)
{
        char lastChar;
        int lastCount;
        int lenSrc, lenDst;

        if (*cpSrc == '\0') {
                if (cpDst)
                        *cpDst = '\0';
                return -1;
        }

        lastChar = *cpSrc;
        lastCount = 1;
        lenSrc = 0;
        lenDst = 0;

        do {
                ++cpSrc;
                ++lenSrc;

                if (*cpSrc != lastChar) {
                        if (cpDst) {
                                *cpDst++ = lastChar;
                                cpDst += sprintf(cpDst, "%d", lastCount);
                        } else {
                                lenDst += 1 + count_digits(lastCount);
                        }
                        lastChar = *cpSrc;
                        lastCount = 1;
                } else
                        ++lastCount;
        } while (*cpSrc != '\0');

        if (cpDst != NULL) {
                *cpDst = '\0';
                return 0;
        }

        if (lenDst < lenSrc)
                return lenDst;
        else
                return -1;
}

char *compress_repeat(const char *str)
{
        int newLen;
        char *newStr;

        newLen = compress_repeat_prim(str, NULL);
        if (newLen < 0)
                return (char *) str;

        newStr = (char *) malloc(newLen + 1);
        if (newStr == NULL)
                return NULL;

        compress_repeat_prim(str, newStr);

        return newStr;
}

Simple test program.

int main(void)
{
        static const char *samples[] = {
                "",
                "aabbccddee",
                "aaabbbcccddd",
                "aaabcccd",
                "aaaabcccd",
                "aabbccccd",
                "aaaaaaaaaa",
        };

        int i;
        char *newStr;

        for (i = 0; i < sizeof(samples) / sizeof(samples[0]); ++i) {
                printf("%s: ", samples[i]);
                newStr = compress_repeat(samples[i]);
                if (newStr != samples[i]) {
                        printf("%s\n", newStr);
                        free(newStr);
                } else
                        printf("Can't be compressed\n");
        }

        return 0;
}

And the test results.

: Can't be compressed
aabbccddee: Can't be compressed
aaabbbcccddd: a3b3c3d3
aaabcccd: Can't be compressed
aaaabcccd: a4b1c3d1
aabbccccd: a2b2c4d1
aaaaaaaaaa: a10
Personal tools
Namespaces

Variants
Actions
Navigation
Toolbox