The Wayback Machine - https://web.archive.org/web/20160609084324/http://codegolf.stackexchange.com/questions/75935/split-a-byte-array-into-a-bit-array

Write a function that when given a buffer b (1 - 104857600 bytes long) and a number of bits n (1 <= n <= 64), splits the buffer into chunks of n bits. Right-pad the last chunk with 0s up to n bits.

e.g.

Given the buffer b = "f0oBaR" or equivalently [102,48,111,66,97,82] and n = 5, return

[12, 24, 24, 6, 30, 16, 19, 1, 10, 8] 

This is because the above buffer, when represented as binary looks like:

01100110 00110000 01101111 01000010 01100001 01010010 

And when re-grouped into 5s looks like:

01100 11000 11000 00110 11110 10000 10011 00001 01010 010[00] 

Which when converted back into decimal gives the answer.
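The regrouping described above can be sketched in Python as follows. This is an illustrative, ungolfed reference version only; the name `split_bits` is an assumption made for this example, and the buffer is assumed to be a `bytes` object or a list of ints in 0-255.

```python
def split_bits(buf, n):
    """Split a byte buffer into n-bit chunks, right-padding the last chunk with 0s."""
    # Render every byte as exactly 8 binary digits and join them into one string.
    bits = "".join(format(byte, "08b") for byte in bytes(buf))
    # Right-pad with zeros so that the length becomes a multiple of n.
    bits += "0" * (-len(bits) % n)
    # Cut into n-digit groups and read each group as a base-2 integer.
    return [int(bits[i:i + n], 2) for i in range(0, len(bits), n)]


print(split_bits(b"f0oBaR", 5))  # [12, 24, 24, 6, 30, 16, 19, 1, 10, 8]
```

Because Python integers have arbitrary precision, the same sketch also covers n up to 64 without special handling.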

Notes

  • You may use whatever data type makes the most sense in your language to represent the buffer. In PHP you'd probably use a string, in Node you might want to use a Buffer
    • If you use a string to represent the buffer, assume it's ASCII for the char -> int conversion
    • You may use an array of ints (0-255) for input if you prefer
  • Return value must be an array or list of ints

Test Cases

> b = "Hello World", n = 50
318401791769729, 412278856237056

> b = [1,2,3,4,5], n = 1
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1

> b = "codegolf", n = 32
1668244581, 1735355494

> b = "codegolf", n = 64
7165055918859578470

> b = "codegolf", n = 7
49, 91, 108, 70, 43, 29, 94, 108, 51, 0

> b = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque vel est eu velit lacinia iaculis. Nulla facilisi. Mauris vitae elit sapien. Nullam odio nulla, laoreet at lorem eu, elementum ultricies libero. Praesent orci elit, sodales consectetur magna eget, pulvinar eleifend mi. Ut euismod leo ut tortor ultrices blandit. Praesent dapibus tincidunt velit vitae viverra. Nam posuere dui quis ipsum iaculis, quis tristique nisl tincidunt. Aliquam ac ligula a diam congue tempus sit amet quis nisl. Nam lacinia ante vitae leo efficitur, eu tincidunt metus condimentum. Cras euismod quis quam vitae imperdiet. Ut at est turpis.", n = 16
19567, 29285, 27936, 26992, 29557, 27936, 25711, 27759, 29216, 29545, 29728, 24941, 25972, 11296, 25455, 28275, 25955, 29797, 29813, 29216, 24932, 26992, 26995, 25449, 28263, 8293, 27753, 29742, 8272, 25964, 27749, 28276, 25971, 29045, 25888, 30309, 27680, 25971, 29728, 25973, 8310, 25964, 26996, 8300, 24931, 26990, 26977, 8297, 24931, 30060, 26995, 11808, 20085, 27756, 24864, 26209, 25449, 27753, 29545, 11808, 19809, 30066, 26995, 8310, 26996, 24933, 8293, 27753, 29728, 29537, 28777, 25966, 11808, 20085, 27756, 24941, 8303, 25705, 28448, 28277, 27756, 24876, 8300, 24943, 29285, 25972, 8289, 29728, 27759, 29285, 27936, 25973, 11296, 25964, 25965, 25966, 29813, 27936, 30060, 29810, 26979, 26981, 29472, 27753, 25189, 29295, 11808, 20594, 24933, 29541, 28276, 8303, 29283, 26912, 25964, 26996, 11296, 29551, 25697, 27749, 29472, 25455, 28275, 25955, 29797, 29813, 29216, 28001, 26478, 24864, 25959, 25972, 11296, 28789, 27766, 26990, 24946, 8293, 27749, 26982, 25966, 25632, 28009, 11808, 21876, 8293, 30057, 29549, 28516, 8300, 25967, 8309, 29728, 29807, 29300, 28530, 8309, 27764, 29289, 25445, 29472, 25196, 24942, 25705, 29742, 8272, 29281, 25971, 25966, 29728, 25697, 28777, 25205, 29472, 29801, 28259, 26980, 30062, 29728, 30309, 27753, 29728, 30313, 29793, 25888, 30313, 30309, 29298, 24878, 8270, 24941, 8304, 28531, 30053, 29285, 8292, 30057, 8305, 30057, 29472, 26992, 29557, 27936, 26977, 25461, 27753, 29484, 8305, 30057, 29472, 29810, 26995, 29801, 29045, 25888, 28265, 29548, 8308, 26990, 25449, 25717, 28276, 11808, 16748, 26993, 30049, 27936, 24931, 8300, 26983, 30060, 24864, 24864, 25705, 24941, 8291, 28526, 26485, 25888, 29797, 28016, 30067, 8307, 26996, 8289, 28005, 29728, 29045, 26995, 8302, 26995, 27694, 8270, 24941, 8300, 24931, 26990, 26977, 8289, 28276, 25888, 30313, 29793, 25888, 27749, 28448, 25958, 26217, 25449, 29813, 29228, 8293, 29984, 29801, 28259, 26980, 30062, 29728, 28005, 29813, 29472, 25455, 28260, 26989, 25966, 29813, 27950, 8259, 29281, 29472, 25973, 26995, 28015, 25632, 29045, 26995, 8305, 30049, 27936, 30313, 29793, 25888, 26989, 28773, 29284, 26981, 29742, 8277, 29728, 24948, 8293, 29556, 8308, 30066, 28777, 29486

> b = [2,31,73,127,179,233], n = 8
2, 31, 73, 127, 179, 233
Is it supposed to work for values of n greater than 8? If so, what about values of n greater than 64, which is larger than most languages' integer precision? – speedplane, Mar 22 at 5:23

Why does the return value have to be ints? – wizzwizz4, Mar 22 at 7:19

Can you add some additional test cases, please? – Erwan, Mar 22 at 10:37

@wizzwizz4 I don't think so. They can't be bytes because they don't have 8 bits. Bitwise operators normally work on ints and not much else. If you have a better suggestion then I'm listening, but otherwise ints it is. – mpen, Mar 22 at 20:22

@wizzwizz4 Because I don't want people to be able to skip a step. I don't want answers like "the first 5 bits of this byte contain the answer" -- the result should not contain any superfluous information, and it should be easily converted back to ASCII or some character mapping (a real-life use-case). Also, given the number of answers so far, it doesn't appear to be a problem. – mpen, Mar 22 at 21:40
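As one possible illustration of the "character mapping" use case mentioned in the last comment: with n = 5, each chunk can index a 32-character alphabet, which is essentially how Base32 works. The sketch below assumes the RFC 4648 alphabet; the helper names `chunks_of_bits` and `to_base32_chars` are made up for this example and do not come from the thread.

```python
import base64

# RFC 4648 Base32 alphabet: one character per 5-bit value (assumed mapping).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def chunks_of_bits(buf, n):
    # Hypothetical helper: split bytes into n-bit integers, zero-padded on the right.
    bits = "".join(format(byte, "08b") for byte in bytes(buf))
    bits += "0" * (-len(bits) % n)
    return [int(bits[i:i + n], 2) for i in range(0, len(bits), n)]


def to_base32_chars(buf):
    # Map every 5-bit chunk to its alphabet character (no '=' padding added).
    return "".join(ALPHABET[value] for value in chunks_of_bits(buf, 5))


print(to_base32_chars(b"f0oBaR"))
# Cross-check against the standard library, ignoring its '=' padding.
print(base64.b32encode(b"f0oBaR").decode().rstrip("="))
```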

16 Answers

Pyth, 18 17 bytes

iR2c.[t.B+C1z\0QQ 

Thanks to @lirtosiast for a byte!

            z      get input
         +C1       prepend a 0x01 to prevent leading zeroes from disappearing
       .B          convert to binary string
      t            remove the leading 1 from ^^
    .[       \0Q   pad right with zeroes to multiple of second input
   c            Q  get chunks/slices of length second input
iR2                map(x: int(x, 2))
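For readers who do not know Pyth, the same sequence of steps can be approximated in Python. This is a rough sketch of the idea rather than a translation of the answer's exact semantics; the function name `split_bits_sentinel` is an assumption for this example.

```python
def split_bits_sentinel(data: bytes, n: int) -> list:
    # Prepend a 0x01 sentinel byte so that leading zero bits of the data
    # are not lost when the whole buffer is read as one big integer.
    as_int = int.from_bytes(b"\x01" + data, "big")
    # bin() yields '0b1...'; drop the '0b' prefix and the sentinel's single 1 bit.
    bits = bin(as_int)[3:]
    # Pad on the right with zeros up to a multiple of n.
    bits += "0" * (-len(bits) % n)
    # Slice into n-digit pieces and parse each piece as base 2.
    return [int(bits[i:i + n], 2) for i in range(0, len(bits), n)]


print(split_bits_sentinel(b"f0oBaR", 5))
```

The sentinel matters for inputs such as [1,2,3,4,5], whose first byte starts with seven zero bits.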

Jelly, 13 bytes

1;ḅ256æ«BḊsḄṖ 

This takes the input as a list of integers. Try it online!

How it works

1;ḅ256æ«BḊsḄṖ  Main link. Arguments: A (list), n (integer)

1;             Prepend 1 to A.
  ḅ256         Convert from base 256 to integer.
      æ«       Bitshift the result n units to the left.
        B      Convert to binary.
         Ḋ     Discard the first binary digit (corresponds to prepended 1).
          s    Split into chunks of length n.
           Ḅ   Convert each chunk from binary to integer.
            Ṗ  Discard the last integer (corresponds to bitshift/padding).
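The padding trick in this answer (shift left by n, then throw away the final chunk) can be sketched in Python as below. The function name `split_bits_shift` is an assumption for illustration, and the sketch mirrors the listed steps rather than Jelly's internals.

```python
def split_bits_shift(values, n):
    # Prepend 1 and read the list as base-256 digits; the leading 1 keeps
    # any leading zero bits of the first byte from vanishing.
    number = 1
    for v in values:
        number = number * 256 + v
    # Shifting left by n appends n zero bits, which is always enough padding.
    number <<= n
    # Drop '0b' and the prepended 1, leaving 8*len(values) + n binary digits.
    bits = bin(number)[3:]
    chunks = [bits[i:i + n] for i in range(0, len(bits), n)]
    # The final chunk consists only of zeros appended by the shift, so discard it.
    return [int(chunk, 2) for chunk in chunks[:-1]]


print(split_bits_shift([102, 48, 111, 66, 97, 82], 5))
```

Over-padding and then dropping one chunk avoids computing how many padding bits are actually required.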

JavaScript (ES6), 120 bytes

f=(a,n,b=0,t=0,r=[])=>b<n?a.length?f(a.slice(1),n,b+8,t*256+a[0],r):b?[...r,t<<n-b]:r:f(a,n,b-=n,t&(1<<b)-1,[...r,t>>b]) 

Recursive bit twiddling on integer arrays. Ungolfed:

function bits(array, nbits) {
    var count = 0;
    var total = 0;
    var result = [];
    for (;;) {
        if (nbits <= count) {
            // We have enough bits to be able to add to the result
            count -= nbits;
            result.push(total >> count);
            total &= (1 << count) - 1;
        } else if (array.length) {
            // Grab the next 8 bits from the array element
            count += 8;
            total <<= 8;
            total += array.shift();
        } else {
            // Deal with any leftover bits
            if (count) result.push(total << nbits - count);
            return result;
        }
    }
}
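The same accumulator loop can be sketched in Python. Since JavaScript's bitwise operators work on 32-bit integers, the code above is presumably limited to fairly small n, whereas Python integers have arbitrary precision. The name `split_bits_accumulate` is an assumption for this example.

```python
def split_bits_accumulate(data, n):
    count = 0    # number of not-yet-emitted bits currently held in total
    total = 0    # the not-yet-emitted bits themselves
    result = []
    for byte in data:
        # Grab the next 8 bits from the input.
        total = (total << 8) | byte
        count += 8
        # Emit as many complete n-bit chunks as are available.
        while count >= n:
            count -= n
            result.append(total >> count)
            total &= (1 << count) - 1
    # Deal with any leftover bits by shifting them into the high positions.
    if count:
        result.append(total << (n - count))
    return result


print(split_bits_accumulate(b"f0oBaR", 5))
```

Unlike the string-based approaches, this version never materialises the full bit string, so memory use stays small even for very large buffers.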
    
@WashingtonGuedes I managed to golf another 9 bytes off my own golf of your solution, but it's still 129 bytes, sorry: "(s,n)=>(s.replace(/./g,x=>(256+x.charCodeAt()).toString(2).slice(1))+'0'.repeat(n-1)).match(eval(`/.{${n}}/g`)).map(x=>+`0b${x}`)".length – Neil, Mar 22 at 9:57

You sure this one runs? The ungolfed version is crashing Chrome. – mpen, Mar 22 at 16:20

@mpen The golfed version definitely runs on Firefox. The ungolfed version may have errors in it. – Neil, Mar 22 at 16:34

Aha! And so it does. I thought Chrome's JS engine was ahead of FF's but I guess not. – mpen, Mar 22 at 20:26

@mpen Fixed a couple of subtle bugs in my ungolfed code for you. – Neil, Mar 22 at 21:45

Julia, 117 bytes

f(x,n,b=join(map(i->bin(i,8),x)),d=endof,z=rpad(b,d(b)+d(b)%n,0))=map(i->parse(Int,i,2),[z[i:i+n-1]for i=1:n:d(z)-n]) 

This is a function that accepts an integer array and an integer and returns an integer array. It's an exercise in function argument abuse.

Ungolfed:

function f(x::Array{Int,1},                   # Input array
           n::Int,                            # Input integer
           b = join(map(i -> bin(i, 8), x)),  # `x` joined as a binary string
           d = endof,                         # Store the `endof` function
           z = rpad(b, d(b) + d(b) % n, 0))   # `b` padded to a multiple of n

    # Parse out the integers in base 2
    map(i -> parse(Int, i, 2), [z[i:i+n-1] for i = 1:n:d(z)-n])
end
    
Why did you temporarily delete it? – CalculatorFeline, Mar 22 at 14:25

@CatsAreFluffy I realized I had done something wrong initially such that it worked for the test case but wouldn't necessarily in general. Should be all good now though. :) – Alex A., Mar 22 at 17:06

Ruby, 114 bytes

->s,n{a=s.bytes.map{|b|b.to_s(2).rjust 8,?0}.join.split""
r=[]
r<<a.shift(n).join.ljust(n,?0).to_i(2)while a[0]
r}

Slightly cleaner:

f = -> str, num {
  arr = str.bytes.map {|byte| byte.to_s(2).rjust(8, "0") }.join.split("")
  result = []
  while arr.size > 0
    result << arr.shift(num).join.ljust(num, "0").to_i(2)
  end
  result
}

puts f["f0oBaR", 5]

Python 3, 102 bytes

j=''.join
lambda s,n:[int(j(k),2)for k in zip(*[iter(j([bin(i)[2:].zfill(8)for i in s+[0]]))]*n)][:-1]

Uses the iter trick to group the bit string into chunks of n.

Results

>>> f([102,48,111,66,97,82],4)
[6, 6, 3, 0, 6, 15, 4, 2, 6, 1, 5, 2, 0]
>>> f([102,48,111,66,97,82],5)
[12, 24, 24, 6, 30, 16, 19, 1, 10, 8]
>>> f([102,48,111,66,97,82],6)
[25, 35, 1, 47, 16, 38, 5, 18]
>>> f([102,48,111,66,97,82],8)
[102, 48, 111, 66, 97, 82]
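The "iter trick" works because `[iter(x)] * n` holds n references to one and the same iterator, so `zip` consumes n consecutive items for every tuple it produces. A small ungolfed sketch follows; the name `split_bits_zip` is an assumption. Note that `zip` silently drops an incomplete trailing group, so the amount of padding matters: in the results above, the n = 4 output has 13 entries although 48 bits make only 12 groups of 4. The sketch therefore pads to an exact multiple of n instead of appending a zero byte.

```python
def split_bits_zip(values, n):
    bits = "".join(format(v, "08b") for v in values)
    # Pad to an exact multiple of n so that zip() has no partial group to drop.
    bits += "0" * (-len(bits) % n)
    # n references to ONE iterator: each output tuple takes n consecutive digits.
    groups = zip(*[iter(bits)] * n)
    return [int("".join(group), 2) for group in groups]


print(split_bits_zip([102, 48, 111, 66, 97, 82], 4))
```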

Perl 6, 93 68 bytes

{@^a».&{sprintf "%08b",$_}.join.comb($^b)».&{:2($_~0 x$b-.chars)}} 

PHP, 262 217 189 bytes

function f($b,$n){$M='array_map';return$M('bindec',$M(function($x)use($n){return str_pad($x,$n,0);},str_split(implode('',$M(function($s){return str_pad($s,8,0,0);},$M('decbin',$b))),$n)));} 

(updated with tips from Ismael Miguel)

Formatted for readability:

function f($b, $n) {
    $M = 'array_map';
    return $M('bindec', $M(function ($x) use ($n) {
        return str_pad($x, $n, 0);
    }, str_split(implode('', $M(function ($s) {
        return str_pad($s, 8, 0, 0);
    }, $M('decbin', $b))), $n)));
}

Example:

> implode(', ',f(array_map('ord',str_split('f0oBaR')),5));
"12, 24, 24, 6, 30, 16, 19, 1, 10, 8"
Instead of str_pad($s,8,'0',STR_PAD_LEFT), you can use str_pad($s,8,0,0). You can remove the quotes on bindec and decbin to save 4 bytes. To save more, you can store array_map in a variable and pass it instead. Here you go: function f($b,$n){$M=array_map;return$M(bindec,$M(function($x)use($n){return str_pad($x,$n,0);},str_split($M('',array_map(function($s){return str_pad($s,8,0,0);},$M(decbin,$b))),5)));} (184 bytes). – Ismael Miguel, Mar 22 at 9:30

Thanks @IsmaelMiguel I think you replaced the implode with $M too though. – mpen, Mar 22 at 15:50

If I did, it was by mistake. I'm really sorry. But I'm glad you liked my variation of your code. – Ismael Miguel, Mar 22 at 15:54

CJam, 30 bytes

{_@{2b8 0e[}%e_0a@*+/-1<{2b}%} 

Try it online!

This is an unnamed block which expects the int buffer and the chunk size n (in bits) on the stack and leaves the result on the stack.

Decided to give CJam a try. Only took me 2 hours to get it done ^^ This is probably too long, suggestions are very welcome!

Explanation

_        e# duplicate the chunk size n
@        e# rotate stack, array now on top and both copies of n on the bottom
{        e# start a new block
 2b      e# convert to binary
 8 0e[   e# add zeros on the left, so the binary is 8 bits
}        e# end previous block
%        e# apply this block to each array element (map)
e_       e# flatten array
0a       e# push an array with a single zero to the stack
@        e# rotate stack, the stack now contains n [array] [0] n
*        e# repeat the array [0] n times
+        e# concat the two arrays
/        e# split into chunks of length n, now the stack only contains the array
-1<      e# discard the last chunk
{2b}%    e# convert every chunk back to decimal

JavaScript (ES6), 104 bytes

Iterative bit-by-bit fiddling.

Edit: 5 bytes saved, thanks to @Neil

(s,g,c=g,t=0)=>(s.map(x=>{for(i=8;i--;--c||(s.push(t),c=g,t=0))t+=t+(x>>i)%2},s=[]),c-g&&s.push(t<<c),s) 

Less golfed

( // parameters
  s, // byte source array
  g, // output bit group size
  // default parameters used as locals
  c = g, // output bit counter
  t = 0 // temp bit accumulator
) => (
  s.map(x =>
  { // for each byte in s
    for(i = 8; // loop for 8 bits
      i--;
      )
      // loop body
      t += t + (x>>i) % 2, // shift t to left and add next bit
      --c // decrement c,if c==0 add bit group to output and reset count and accumulator
        ||(s.push(t), c=g, t=0)
  },
  s=[] // init output, reusing s to avoid wasting another global
  ),
  c-g && s.push(t<<c), // add remaining bits, if any
  s // return result
)
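A plain-Python rendering of the same bit-at-a-time idea might look like this. It is only a sketch; the names `split_bits_bitwise`, `missing` and `acc` are chosen for this example and do not appear in the answer.

```python
def split_bits_bitwise(data, n):
    out = []
    missing = n   # bits still needed to complete the current group
    acc = 0       # accumulator for the current group
    for byte in data:
        # Walk through the byte from its most significant bit down.
        for i in range(7, -1, -1):
            # Shift the accumulator left by one and add the next bit.
            acc = acc * 2 + ((byte >> i) & 1)
            missing -= 1
            if missing == 0:
                # Group complete: emit it, then reset counter and accumulator.
                out.append(acc)
                missing, acc = n, 0
    # A partially filled last group is shifted left, i.e. right-padded with zeros.
    if missing != n:
        out.append(acc << missing)
    return out


print(split_bits_bitwise([102, 48, 111, 66, 97, 82], 5))
```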

Test

f=(s,g,c=g,t=0)=>(s.map(x=>{for(i=8;i--;--c||(s.push(t),c=g,t=0))t+=t+(x>>i)%2},s=[]),c-g&&s.push(t<<c),s)

function test()
{
  var a = A.value.match(/\d+/g)||[]
  var g = +G.value
  var r = f(a,g)

  O.textContent = r
  K.innerHTML = a.map(x=>`<i>${(256- -x).toString(2).slice(-8)}</i>`).join`` + '\n'+ r.map(x=>`<i>${(256*256*256*256+x).toString(2).slice(-g)}</i>`).join``
}

test()
#A { width: 50% } #G { width: 5% } i:nth-child(even) { color: #00c } i:nth-child(odd) { color: #c00 }
Input array <input id=A value="102,48,111,66,97,82"> Group by bits <input id=G value=5> (up to 32)<br> Output <button onclick="test()">-></button> <span id=O></span> <pre id=K></pre>

Instead of doubling x each time, why not shift x right i bits? – Neil, Mar 22 at 21:34

@Neil uh...why...idiocy? – edc65, Mar 22 at 22:23

I just noticed that c-g?[...s,t<<c]:s might save you a couple more bytes. – Neil, Mar 23 at 0:47

@Neil this requires some thoughts – edc65, Mar 23 at 8:56

J, 24 bytes

[:#.-@[>\;@(_8:{."1#:@]) 

This is an anonymous function, which takes n as its left argument and b (as a list of numbers) as its right argument.

Test:

   5 ([:#.-@[>\;@(_8:{."1#:@])) 102 48 111 66 97 82
12 24 24 6 30 16 19 1 10 8

Explanation:

[:#.-@[>\;@(_8:{."1#:@])
                   #:@]   NB. Convert each number in `b` to bits
            _8:{."1       NB. Take the last 8 items for each
                          NB. (padding with zeroes at the front)
         ;@               NB. Make a list of all the bits
    -@[                   NB. Negate `n`
                          NB. (\ gives non-overlapping infixes if [<0)
       >\                 NB. Get non-overlapping n-sized infixes
[:#.                      NB. Convert those back to decimal

Haskell, 112 109 bytes

import Data.Digits
import Data.Lists

n#x=unDigits 2.take n.(++[0,0..])<$>chunksOf n(tail.digits 2.(+256)=<<x)

Usage example: 5 # [102,48,111,66,97,82] -> [12,24,24,6,30,16,19,1,10,8].

How it works

import Data.Digits        -- needed for base 2 conversion
import Data.Lists         -- needed for "chunksOf", i.e. splitting in
                          -- sublists of length n

(                  =<<x)  -- map over the input list and combine the
                          -- results into a single list:
tail.digits 2.(+256)      -- convert to base two with exactly 8 digits
chunksOf n                -- split into chunks of length n
<$>                       -- convert every chunk (<$> is map)
take n.(++[0,0..])        -- pad with 0s
unDigits 2                -- convert from base 2
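The `(+256)` step deserves a note: for 0 <= x <= 255, x + 256 always has exactly nine binary digits, so dropping the leading one leaves a zero-padded 8-digit representation. A Python sketch of the same pipeline is given below; the helper names are assumptions, and `islice(chain(chunk, repeat(0)), n)` stands in for `take n.(++[0,0..])`.

```python
from itertools import chain, islice, repeat


def byte_bits(x):
    # x + 256 has exactly 9 binary digits; bin() gives '0b1dddddddd',
    # so slicing off three characters leaves the 8 digits of x.
    return [int(d) for d in bin(x + 256)[3:]]


def from_bits(bits):
    # Fold a sequence of 0/1 digits into an integer (like `unDigits 2`).
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def split_bits_lists(values, n):
    # Flatten the per-byte digit lists into one list (like `=<<`).
    bits = [bit for x in values for bit in byte_bits(x)]
    # Split into sublists of length n; the last one may be shorter (like `chunksOf n`).
    chunks = [bits[i:i + n] for i in range(0, len(bits), n)]
    # Pad each chunk with an endless supply of zeros and take exactly n digits.
    return [from_bits(islice(chain(chunk, repeat(0)), n)) for chunk in chunks]


print(split_bits_lists([102, 48, 111, 66, 97, 82], 5))
```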

Java, 313 306 322 bytes

I hope this beats PHP... And nope. Stupid long function names.

-7 thanks to @quartata for getting rid of public
+16 to fix an error when the split was exact, thanks to @TheCoder for catching it

int[] f(String b,int s){int i=0,o[]=new int[(int)Math.ceil(b.length()*8.0/s)],a=0;String x="",t;for(char c:b.toCharArray()){t=Integer.toString(c,2);while(t.length()<8)t="0"+t;x+=t;a+=8;while(a>=s){o[i++]=Integer.parseInt(x.substring(0,s),2);x=x.substring(s,a);a-=s;}}while(a++<s)x+="0";o[i]=Integer.parseInt(x,2);return o;} 
I don't think you have to make the function public. – quartata, Mar 22 at 14:33

Thanks! I did not realize... – Blue, Mar 22 at 18:06

What version of Java did you run this in? It doesn't seem to compile: ideone.com/3tonJt – mpen, Mar 22 at 21:44

@mpen Ah, whoops. I forgot, I changed it on my computer before posting. Will fix. – Blue, Mar 23 at 1:41

@JackAmmo yup, sure did. Stupid tiny phone keyboard. – Blue, Mar 23 at 1:50

PowerShell, 146 bytes

param([int[]][char[]]$b,$n)-join($b|%{[convert]::ToString($_,2).PadLeft(8,"0")})-split"(.{$n})"|?{$_}|%{[convert]::ToInt32($_.PadRight($n,"0"),2)} 

Takes in the buffer and converts it to a char array and then to an integer array. Each of those integers is converted to binary and left-padded with 0s to 8 digits where needed, and the results are joined into one large string. That string is split every n characters, and the blank entries the split creates are dropped. Each element from the split is right-padded (only the last element really needs it) and converted back into an integer. The output is an array.
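The split-on-a-capturing-group idea carries over to other regex engines. Below is a hedged Python sketch using `re.split`, which likewise keeps captured text in its output; the function name `split_bits_regex` is an assumption, and the input is assumed to be `bytes` or a list of ints.

```python
import re


def split_bits_regex(data, n):
    # Join the 8-digit binary form of every byte into one large string.
    bits = "".join(format(b, "08b") for b in data)
    # Splitting on a capturing group keeps each matched n-digit piece in the
    # result, interleaved with empty strings; an unmatched remainder comes last.
    pieces = [p for p in re.split("(.{%d})" % n, bits) if p]
    # Right-pad (only the last piece can be short) and parse as base 2.
    return [int(p.ljust(n, "0"), 2) for p in pieces]


print(split_bits_regex(b"f0oBaR", 5))
```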


Python 3.5 - 312 292 bytes:

def d(a, b):
    o=[];o+=([str(bin(g)).lstrip('0b')if str(type(g))=="<class 'int'>"else str(bin(ord(g))).lstrip('0b')for g in a]);n=[''.join(o)[i:i+b]for i in range(0,len(''.join(o)),b)];v=[]
    for t in n:
        if len(t)!=b:n[n.index(t)]=str(t)+'0'*(b-len(t))
    v+=([int(str(f),2)for f in n])
    return v

Although this may be long, it is, to my knowledge, the shortest way to accept both strings and arrays without errors while still retaining some accuracy in Python 3.5.


Java, 253 247 bytes

Golfed

int i,l,a[];Integer I;String f="";int[]c(String s,int n){for(char c:s.toCharArray())f+=f.format("%08d",I.parseInt(I.toString(c, 2)));while(((l=f.length())%n)>0)f+="0";for(a=new int[l=l/n];i<l;)a[i]=I.parseInt(f.substring(i*n,i++*n+n),2);return a;} 

UnGolfed

int i,l,a[];
Integer I;
String f="";

int[]c(String s,int n) {
    for(char c:s.toCharArray())
        f+=f.format("%08d",I.parseInt(I.toString(c,2)));
    while(((l=f.length())%n)>0)
        f+="0";
    for(a=new int[l=l/n];i<l;)
        a[i]=I.parseInt(f.substring(i*n,i++*n+n),2);
    return a;
}
