|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectcodesimian.CS<CSGeneric>
codesimian.DefaultCS
codesimian.SelfPredictingPrimeHashingNumberList
public class SelfPredictingPrimeHashingNumberList
THIS CLASS IS NOT FINISHED. 9/07 I still intend to finish building this.
The purpose of this algorithm is to play realtime audio from any subset of a compressed file,
and for each possible subset to predict the entire audio a little differently.
It will not matter which individual floats from the compressed file you load into memory,
any subset of single floats from any set of
indexs (in array bounds) can approximate the entire original audio data.
Use this to dynamicly choose which parts of the compressed file to load into memory at a time,
chosen by which were most needed to predict the values (-1 to 1) of audio that were requested.
Many times per second, load many new floats from the compressed file, and remove many floats from memory,
changing which subset of floats is loaded into memory but keeping the same size.
Example: 500,000 floats in memory and 5,000,000 floats in a file waiting to be loaded
and 50,000,000 floats in the original audio data. I dont know exactly what compression ratio this will get.
Use that different set of floats to better predict the new values requested (of the original audio data).
Predict which indexs (containing predictions of predictions... of audio data) will
be requested by which indexs were requested recently.
ALL ARRAY SIZES MUST BE PRIME. THE ARRAYS data[] AND prime[] MUST BE THE SAME SIZE.
IF YOU MUST HAVE A SPECIFIC ARRAY SIZE, CHOOSE THE SMALLEST
PRIME AT LEAST THAT SIZE AND FILL THE END OF THE ARRAY WITH ZEROS.
------------------ALGORITHM:------------------
Start with a float[] array (size must be prime) containing numbers between -1.0 and 1.0,
representing a sequence of audio samples.
A different list (same size as the float[] array) contains UNIQUE PRIME numbers,
each number's value at least 1000 times bigger than the float[] array's size (for good hashing).
Each number in the float[] array is paired with one of the prime numbers.
Each (value at an) index in the float[] array is approximately a prediction
derived from the values at many of the other indexs.
Each (value at an) index is paired with a distribution of its prime number hashing over
index quantity and divided by index quantity to give a set of numbers all between 0 and 1.
Values in the float[] array should slowly change so that any set of other (values at) indexs predicts them better.
The result of this change is that you can approximate any (value at an) index by using any set of other (values at) indexs,
and the more (i'm going to stop writing 'values at' here...) indexs you have, the better you can predict others.
This algorithm is a form of LOSSY COMPRESSION (not limited to but designed for: AUDIO).
By reordering the prime numbers (so each pairs with a different index than before),
you can find a float[] array that predicts itself better (and still is similar to the original audio data).
By adjusting some floats in the array more or less than the others eacy cycle,
the float[] array would learn differently therefore will become a slightly different array.
When the float[] array has been through enough cycles,
you can choose any arbitrary subset of floats and use them to predict any arbitrary subset
of floats from the uncompressed original audio data. Save this subset of floats as the compressed file.
You can even split the compression into multiple files, and any subset of the files combines to
a slightly different version of the audio, the same way subsets of file(s) can be combined.
But it does not blur times. Times are at exact array indexs.
It is easy to predict the distribution of 1 prime number hashed over array length divided by array length.
For the next biggest weight, add the prime to the current index and wrap around array length.
For the next smallest, subtract and wrap. Of course you can multiply and modulus it
to ***decrease the Big-O from linear to to constant***. It would work better to
use the square-root of [the index divided by array length] instead of index divided by array length
because influence of indexs on other indexs would be more localized (but still blurred),
and much less influence from/to the majority.
Save the whole (or part of) changed audio data to a file.
Load any part of the audio data by predicting it based on which parts
of the changed data are currently in memory (less than all of it).
Using the prediction described above, predict, then predict based on those predictions,
and form a small prediction-tree... to more accurately predict the
root of the tree (all indexs are roots, but you only play 1 at a time).
Using this method, you can get more accurate numbers when predicting from a small part of the compressed data.
After the compressing starts, the prime numbers are never reordered,
and should be saved in any file that contains a float[] array derived from those numbers.
The advantage of this algorithm for lossy audio compression is that
you can decompress any part of the audio from any subset of the compressed file.
You could take single floats from random places in the file and use them to predict any of the original audio data.
All floats are part of all original audio data, relative to an exponent (2? 3? 6.7?) and prime number hashing.
To do the next cycle, for each index, use all (or the most-influencing subset calculated by prime numbers),
to calculate the value of that index. Compare it to its current value.
Modify it to be a little closer to the predicted value.
After you do that for all indexs, the array is ready to predict itself and modify itself again.
Each time it becomes more distorted and more compressed and each moment of sound is spread more through the whole array.
For each index x, the sum of all weight(x,y) for all indexs y, should equal 1 (and none may be negative).
The final value at each index is a weighted sum of predictions of other indexs.
Combine this algorithm (above) with the neural network for natural-language and compiling I'm building 8/06???
6/07 There is 1 best natural language algorithm, and it can be applied to words or letters.
I know it will later be better to use doubles and longs,
but for now I have chosen floats for the audio data and ints for the primes.
This is necessary to compress smaller than MP3s, which use 16-bit numbers.
floats that are always between -1 and 1 waste memory. 9 bits of each float are exponent.
It would be better to use an int divided by 2^31.
THIS IS A COMPLEX ALGORITHM. THIS CLASS IS NOT FINISHED, BUT THE JAVADOC
PROBABLY CONTAINS EVERYTHING YOU NEED TO BUILD IT (BUT NOT ORGANIZED WELL).
--------------------FILE:---------------------
For better compression in the file, maybe the float[] array should instead be
saved as 24 bit unsigned integers that when interpreted as double (64 bits) are between 0.0 and 1.0 in memory.
Its probably more efficient for its range to be 0.0 to .999999999999999 instead of 1.0
because the bits would align better (without a crack near the middle),
therefore it should range 0 (inclusive) to 1 (exclusive).
Or maybe use less bits per number because normal audio is only 16 bits per number.
If you use anything other than the same list of prime numbers every time,
you need to save the primes in the file with the floats. Use ints (32 bits) for primes,
or for more efficiency, use a few less bits. Create a header which tells the sizes of:
the header, the float section, and the int section. If there are more floats than ints, wrap.
If there are more ints than floats, the extra ints are wasted and an error thrown.
Or pair one float and one int and repeat? Many possible ways to store a file...
...
A way that is usually smallest and always simplest: Dont reorder the primes. Specify a minimum prime
and quantity of primes, and the set of primes is known from that information.
You can compress without reording the primes. Thats just an optimization.
Field Summary |
---|
Fields inherited from class codesimian.CS |
---|
DESCRIPTION, END, EXECPROXY, HEAP, JAVACODE, MYFUEL, NAME, NEWINSTANCE, NULL, PARENT, PARSEPRIORITY, PREV, TESTER, THIS |
Constructor Summary | |
---|---|
SelfPredictingPrimeHashingNumberList()
|
Method Summary | |
---|---|
double |
cost()
cost() should be changed to return a float, and should be renamed to costToExecute() cost of EXECUTING this CS, not including any CSs it executes recursively. |
int |
defaultArraySize()
Must be prime. |
java.lang.String |
description()
Deprecated. public byte isIllusion(int index){ return 1; } public int countP(){ checkArrays(); return data.length; } |
double |
DForProxy()
modify data[] 1 cycle farther, to predict itself better. |
java.lang.String |
keyword()
For the CodeSimian language as a String. CodeSimian language keyword, like "+" "*" "max" ">" etc. Override this function if you want to specify a keyword other than how I derive them from the class name, like + for Add. Some CSs might never be intended to be used in the language by their keyword. The best example (4/05) is Num, because it is used in the language like "3.4" instead of "num()". |
int |
minP()
For DForProxy(). Minimum number of parameters in param[] needed to call DForProxy(). Defines which indexs of param[] DForProxy() can use. Functions with a different number of parameters must override this. OVERRIDE THIS FUNCTION IF EXEC USES A DIFFERENT NUMBER OF PARAMETERS. Default is 1. |
int |
minPrimeIndex()
2 is the first prime-number. |
CS |
P(int index)
WARNING: if add CSs then delete them, they are still in the param[] array and can be returned in this function, despite them being out of valid range: index at least countP(). |
double |
PD(int index)
Returns a param as a double. |
float |
PF(int index)
|
float |
predict(int index,
int useThisManyNumbersToPredictIt)
looks at useThisManyNumbersToPredictIt numbers in data[] and predicts the value at data[index] without looking at data[index] except by accident. |
float[] |
predictAll(int useThisManyNumbersToPredictEachIndex)
|
static void |
randomizeIndexs(int[] anyArray)
|
boolean |
setD(int index,
double value)
sets a param to a double value. |
boolean |
setF(int index,
float value)
|
boolean |
setP(int index,
CS value)
Every CS is a list of other CSs, between size minP() and maxP() inclusive. |
float |
weight(int nonlinearIndexsAhead)
weights may or may not be calculated by some function of (prime[x]*nonlinearIndexsAhead)%data.length. |
float |
weight(int predictThisIndex,
int observeMe)
returns the weight of how much the index observeMe should be used to predict the value at predictThisIndex. |
Methods inherited from class codesimian.DefaultCS |
---|
B, C, countP, decrementMyFuel, deleteP, F, fuel, getExec, getObject, heap, I, indexP, indexPName, insertB, insertC, insertD, insertF, insertI, insertJ, insertL, insertL, insertL1, insertP, insertS, insertZ, J, javaCode, LForProxy, LForProxy, myFuel, name, newInstance, objectToCS, objectToCSArray, objectToCSArray, prevD, prevL, PType, S, setB, setC, setCountP, setD, setExec, setFuel, setI, setJ, setL, setL, setL, setL1, setMyFuel, setName, setObject, setPrevExec, setPType, setS, setZ, start, toString, V, Z |
Methods inherited from class codesimian.CS |
---|
addB, addC, addD, addF, addI, addJ, addL, addP, addP, addP, addP, addP, addS, addZ, BForProxy, CForProxy, clone, D, deleteP, FForProxy, GETB, GETC, GETD, GETF, GETI, GETJ, GETL, GETS, GETZ, IForProxy, isIllusion, JForProxy, L, L, L, L, L, maxD, maxP, minD, overwrites, parent, parsePriority, PB, PC, PI, PJ, PL, prevB, prevC, prevF, prevI, prevJ, prevS, prevZ, proxyOf, PS, PZ, reflect, reflect, reflect6, setB, SETB, setC, SETC, setCost, SETD, setDescription, setF, SETF, setHeap, setI, SETI, setJ, SETJ, SETL, setL, setL, setParent, setParsePriority, setProxyOf, setS, SETS, setTester, setZ, SETZ, SForProxy, tester, toJavaCode, VForProxy, voidReflect, ZForProxy |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public SelfPredictingPrimeHashingNumberList()
Method Detail |
---|
public java.lang.String keyword()
DefaultCS
keyword
in class DefaultCS
CS.parent()
,
CS.newInstance()
,
CS.name()
public double DForProxy()
DForProxy
in class DefaultCS
public double cost()
CS
cost
in class CS
public int minP()
DefaultCS
minP
in class DefaultCS
public CS P(int index)
DefaultCS
P
in class DefaultCS
index
- range 0 (or neg?) to countP()-1 inclusiveCS.heap()
public double PD(int index)
CS
PD
in class CS
public float PF(int index)
PF
in class CS
CS.PD(int)
public boolean setP(int index, CS value)
CS
setP
in class DefaultCS
public boolean setD(int index, double value)
CS
setD
in class DefaultCS
public boolean setF(int index, float value)
setF
in class DefaultCS
CS.setD(int,double)
public java.lang.String description()
description
in class DefaultCS
public float[] predictAll(int useThisManyNumbersToPredictEachIndex)
public float predict(int index, int useThisManyNumbersToPredictIt)
public int defaultArraySize()
public int minPrimeIndex()
public float weight(int predictThisIndex, int observeMe)
public float weight(int nonlinearIndexsAhead)
public static void randomizeIndexs(int[] anyArray)
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |