I have a coworker who does stuff like this and it's always low-benefit optimizations that cost the team time to interface with - but I do still kind of love it
Back in the day those tricks were common. Some PDP-11 OS's supported a "Radix-50" encoding (50 octal = 40 decimal) that packed 3 characters into a 16 bit word (40 codepoints=26 letters, 10 digits, and a few punctuation). So you could have a 6.3 filename in 3 words.
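For anyone curious what that looks like in practice, here's a rough sketch of Radix-50 packing, assuming the usual DEC character table (space, A-Z, '$', '.', one mostly-unused slot, 0-9). Names are mine, not from any DEC header, and it assumes exactly 3 valid input characters:

```c
#include <stdint.h>
#include <string.h>

/* DEC Radix-50 character set: index in this string is the code point.
   space=0, A-Z=1..26, '$'=27, '.'=28, '%'=29 (often unused), '0'-'9'=30..39 */
static const char RAD50[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

/* Pack 3 characters into one 16-bit word: c0*40*40 + c1*40 + c2.
   40^3 = 64000, which fits in 16 bits. Unknown chars map to space. */
uint16_t rad50_pack(const char *s) {
    uint16_t w = 0;
    for (int i = 0; i < 3; i++) {
        const char *p = s[i] ? strchr(RAD50, s[i]) : NULL;
        w = (uint16_t)(w * 40 + (p ? (p - RAD50) : 0));
    }
    return w;
}

void rad50_unpack(uint16_t w, char out[4]) {
    out[2] = RAD50[w % 40]; w /= 40;
    out[1] = RAD50[w % 40]; w /= 40;
    out[0] = RAD50[w % 40];
    out[3] = '\0';
}
```

So a 6.3 filename is two words for the name, one for the extension.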
Oh god that switch statement. Here, let me come up with something better:
int turn_char_to_int(char pChar) {
    if (pChar >= 'a' && pChar <= 'z') {
        return pChar - 'a' + 10;
    } else if (pChar == ' ') {
        return 36;
    } else if (pChar == '.') {
        return 37;
    }
    return 0;
}
Ah, that's cool. Did not know you could do that. Thanks.
Does the efficiency of storage actually matter? Are you working on a constrained system like a microcontroller? Because if you’re working on regular software, supporting Unicode is waaaaaaaaaaay more valuable than 20% smaller text storage.
Unicode? Sir this is C, if the character doesn't fit into a uint8 it's scope creep and too hard
I do sometimes work with microcontrollers, but so far I have not encountered a condition where these minimal savings could ever be useful.
You could save 0.64 bits per char more if you actually treated your output as a binary number (using 6 bits per char) and didn't go through the intermediary string (implicitly using base 100 at 6.64 bits per char).
This would also make your life easier by allowing bit manipulation to slice/move parts and reducing work for the processor because base 100 means integer divisions, and base 64 means bit shifts. If you want to go down the road of a "complicated" base use base 38 and get similar drawbacks as now, except only 5.25 bits per char.
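A sketch of what that 6-bit-per-char version could look like. The mapping and names are made up (lowercase letters, space, and period, with 0 as an end marker); ten chars fit in the low 60 bits of a uint64_t, and slicing one back out is just a shift and a mask:

```c
#include <stdint.h>

/* Hypothetical 6-bit code: 'a'-'z' -> 1..26, ' ' -> 27, '.' -> 28, 0 = end. */
static uint8_t enc6(char c) {
    if (c >= 'a' && c <= 'z') return (uint8_t)(c - 'a' + 1);
    if (c == ' ') return 27;
    if (c == '.') return 28;
    return 0;
}

static char dec6(uint8_t v) {
    if (v >= 1 && v <= 26) return (char)('a' + v - 1);
    if (v == 27) return ' ';
    if (v == 28) return '.';
    return '\0';
}

/* Pack up to 10 chars into the low 60 bits of a uint64_t.
   Only shifts and ORs -- no integer division as with base 100. */
uint64_t pack6(const char *s) {
    uint64_t w = 0;
    for (int i = 0; i < 10 && s[i]; i++)
        w |= (uint64_t)enc6(s[i]) << (6 * i);
    return w;
}

void unpack6(uint64_t w, char out[11]) {
    int i;
    for (i = 0; i < 10; i++) {
        char c = dec6((uint8_t)((w >> (6 * i)) & 0x3F));
        if (!c) break;
        out[i] = c;
    }
    out[i] = '\0';
}
```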
I was so triggered by the conversion from char-to-int-to-string-to-packedint that I had to write a bitwise version that just does char-to-packedint (and back again), with bitwise operators.
As others have pointed out, there are probably better options for doing this today in most real-life situations, but it might make sense on old low-spec systems if not for all the intermediate conversion steps, which is why I wrote this.
in the same vein (storing more data in less bits) you should check out tagged pointers as well!
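For the unfamiliar: an aligned pointer's low bits are always zero, so you can stash a small tag in them and mask it off before dereferencing. A minimal sketch with made-up names (strictly speaking the C standard only blesses this through uintptr_t round-tripping, but it's how many dynamic-language VMs distinguish small ints from heap objects):

```c
#include <stdint.h>
#include <assert.h>

/* Pointers to 8-byte-aligned data have their low 3 bits clear,
   leaving room for a 3-bit tag. */
#define TAG_MASK ((uintptr_t)0x7)

static void *tag_ptr(void *p, unsigned tag) {
    assert(((uintptr_t)p & TAG_MASK) == 0 && tag <= 7);
    return (void *)((uintptr_t)p | tag);
}

static unsigned get_tag(void *p) {
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

static void *strip_tag(void *p) {
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```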
I don't think that's a useless implementation at all. code looks relatively clean, and it definitely has its uses in the embedded systems world.
It’s all fun and games until the requirement changes and you need to support uppercase letters and digits as well.
In typical C fashion, there's undefined behavior in turn_char_to_int. xD
We have a binary file that has to maintain compatibility with a 16-bit Power Basic app that hasn't been recompiled since '99 or '00. We have storage for 8-character strings in two ints, and 12-character strings in two ints and two shorts.
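Roughly like this, I imagine (field names made up; the fixed width doubles as the length, so there's no NUL terminator in the stored form, and short names get zero-padded):

```c
#include <stdint.h>
#include <string.h>

/* 8 characters stored raw in two 32-bit ints, 4 bytes each. */
struct rec8 {
    int32_t lo;  /* chars 0..3 */
    int32_t hi;  /* chars 4..7 */
};

void store8(struct rec8 *r, const char *s) {
    char buf[8] = {0};  /* zero-pad short strings */
    for (int i = 0; i < 8 && s[i]; i++)
        buf[i] = s[i];
    memcpy(&r->lo, buf, 4);
    memcpy(&r->hi, buf + 4, 4);
}

void load8(const struct rec8 *r, char out[9]) {
    memcpy(out, &r->lo, 4);
    memcpy(out + 4, &r->hi, 4);
    out[8] = '\0';
}
```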
Damn, those are setups where you have to get creative.
Yes I know, the code is probably bad, but I do not care
That's why we love it.
I mean… you’d get better results for large data sets by just using a known compression algorithm. This is only viable for situations where you only have a small amount of data, enough computation to run this conditional, but not enough computation to run compression/decompression.
I'm a simple man. Here's my list of variable names:
Var1
Var2
Var3
Var4
Var5
...