I was discussing some magic numbers a collegue had in his code and was pondering if some macro magic could help. The problem is converting a timestamp from a hardware timer to a value in microseconds. The usual trick of multiplying by a constant then shifting left n times was used. My collegue used a spreadsheet to calculate the multiplier and the shift as a power of 2. The spreadsheet calc'd the best combination. Normally, i'd just choose a handy shift like 16 or so and work that back to the multiplier. This would be easy in a macro. Can anyone suggest a method to solve the above problem using the C preprocessor? And not using something like perl or python!

## Macro to calc mantissa and exponent

**Kartman wrote:**

And not using something like perl or python!

Even though it's 1,000 times easier?

there is no magic!

The general rules is first to decide 8 16 24 ... bit's for the constant.

then you find your number, then you shift it all the way up.

like div with 100 in 16 bit give:

mul with 655(.36)

now you shift up so it above 32768

that is 41943(.04)

That is by def. the best "raw" number. (perhaps the half number is as good but never better, on an AVR without a barrel shifter it's sometimes better to got 16 without shift, than 8 bit with! )

But it depends of what you need the number for, just a factor, you are done

But used for anything accurate you might play with offset numbers.

Perhaps there is some code macro to find the correct PLL for a tuner, or setting baudrate for CPU's with fraction, that is the same kind of problem.

Unfortunately, solving one problem might create a few more if we use scripting! As sparrow says, it's like calculating a PLL divisor. You need to iterate to find the best solution and my pre-processor fu is not strong. It strikes me as one of the problems that someone has figured a classy solution, but I couldn't quite figure out some search terms to look for.

I'm a bit confused. If "preprocessor", at build time, then why can't external tools be used to find the number(s) to plug into the source?

External tools are currently used! The human in this instance. The challenge is: can the C preprocessor be used? I did some reading and my solution was to use a chain of ifs which was not pretty.

The challenge is: can the C preprocessor be used?

Not elegantly, if at all. There is no facility for iteration in the preprocessor.

I've done what you're trying to do with a host-side C helper program. Like Cliff says, Perl or Python would work as well.

Start here for some algorithms which you can adapt to a host-side helper program.

http://embeddedgurus.com/stack-overflow/2009/06/division-of-integers-by-constants/

http://www.easysurf.cc/cnver17.htm#b10tob2

I can PM you one I used for a recent project, but it's a dog's breakfast of which I am not especially proud :)

Joey, i came to the same conclusion after some reading. I thought i'd put it out there to see if there was a possibility. I'd never really done anything more than basic stuff with the preprocessor - never really needed to.

I just had a read of the article you linked to. There was a spreadsheet used to calc the numbers that went for the lowest error, so a similar technique.

The preprocessor can be handy for a number of compile-time tasks, like selecting the correct prescaler and top value for a timer given a desired frequency and a known system clock speed.

Or selecting to correct prescaler for the ADC based on similar constraints:

// Define bounds for the clkADC. #define ADC_CLK_MIN 50000UL #define ADC_CLK_MAX 200000UL // Derive appropriate ADC prescaler. // Again, some devices like the ATmega32HVE2/ATmega64HVE2 have a completely // different ADC, but (almost?) all ATmega and ATtiny ADC have a 3-bit // prescaler, and those three bits are in the same place in ADCSR or ADCSRA. #define ADC_CLK ((F_CPU) / ADC_PRESCALER) #define ADC_PRESCALER 2U #define ADC_PRESCALER_MASK 0x0U #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 4U #define ADC_PRESCALER_MASK 0x2U #endif #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 8U #define ADC_PRESCALER_MASK 0x3U #endif #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 16U #define ADC_PRESCALER_MASK 0x4U #endif #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 32U #define ADC_PRESCALER_MASK 0x5U #endif #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 64U #define ADC_PRESCALER_MASK 0x6U #endif #if (ADC_CLK > ADC_CLK_MAX) #undef ADC_PRESCALER #undef ADC_PRESCALER_MASK #define ADC_PRESCALER 128U #define ADC_PRESCALER_MASK 0x7U #endif #if (ADC_CLK < ADC_CLK_MIN) #error No ADC prescaler exists for ADC_CLK_MIN <= clkADC. #endif #if (ADC_CLK > ADC_CLK_MAX) #error No ADC prescaler exists for clkADC <= ADC_CLK_MAX. #endif

The challenge is: can the C preprocessor be used?

Not elegantly, if at all. There is no facility for iteration in the preprocessor.

Y'all need to decide if the below helps -- it makes my head hurt...

http://jhnet.co.uk/articles/cpp_...

C Pre-Processor Magic

The C Pre-Processor (CPP) is the somewhat basic macro system used by the C programming language to implement features such as

`#include`

and`#define`

which allow very simple text-substitutions to be carried out at compile time.In this article we abuse the humble #define`#define`

to implement if-statements and iteration.Before we begin, a disclaimer: these tricks, while perfectly valid C, should not be considered good development practice and should almost certainly not be used for "real work". That said it can totally be used for fun home-automation projects...

Also http://stackoverflow.com/questio... and links therein.

This one mentions Boost, and also C++ templates: http://stackoverflow.com/questio...

Those are the first three hits for a Google search on "c preprocessor iteration". All the first page hits seem to be pertinent to the topic.

I've come across those and similar in the past. They made my head hurt, too. Other than the intellectual exercise, I saw little use for them. I prefer sensible code ;-)

Apropos, @netizen touches on the Boost Preprocessor Library here:

I'm pretty sure I could do it, but not without more information:

What is the range of values for the hardware timer?

What is the range of values for the number of microseconds?

Is the number of microseconds supposed to be an integer?

If not, what precision is desired?

What is the range of possible ratios for timer values to microseconds?

Skeeve, there is not an issue in the calculation, but rather doing an iteration to find the closest solution using the C pre-processor. In answer to your questions:

Range of values of the hardware timer: unsigned 32bit, ie 0..2^32

Range of values for the number of microseconds: assume the h/w timer is ticking at 48-150MHz

Result as an integer

Precision: +/- a few %. The general idea is to find the closest fraction.

Possible range of ratios? Could vary. A general solution is desired.

I've had a browse of the links the others have posted. So I understand a little more about the pre-processor and the solution might be like putting a square peg into a round hole.

The more obvious solution is to use a script to process some tags in the source and insert the results. I'm seeing this become more commonplace and python seems to be a favourite.Just thinking of this raises the question of how we could put the python (or other script) in-line with the C/C++ code, shove it through a program, execute the script and pop out the result and then compile.

The python script would be run as part of the build. Make can handle this fairly easily. You set the dependencies so that the script runs whenever the file containing the appropriate constants has changed. The python script spits out a C header containing more macros which capture the results of the calculation, and the other source files have that header as a dependency as well. Make handles the rest.

But it sounds like perhaps you don't need a python script per se. A host-side C helper program could do the same job. I use the same sort of thing to calculate the correct coefficients to use for fast division as per:

http://embeddedgurus.com/stack-overflow/2009/06/division-of-integers-by-constants/

http://www.easysurf.cc/cnver17.htm#b10tob2

I've used something like this in the past:

// See http://embeddedgurus.com/stack-overflow/2009/06/division-of-integers-by-constants/ // and http://www.easysurf.cc/cnver17.htm#b10tob2 // // Example // ------- // // Common: // K = 1023 // 1/K = .00097751710654936461388074291300097751710654936461 // 1/K binary = 0.000000000100000000010000000001000000000100000000010000000001 // 1/K shifted = 0.100000000010000000001000000000100000000010000000001 // S=9 // // Recipe 1: // 33MSB = 100000000010000000001000000000100 // +1 = 100000000010000000001000000000101 // 32MSB = 10000000001000000000100000000010 // M = 0x80200802 // q_fast = (((uint64_t)A * (uint64_t)M) >> 32) >> S // // Recipe 2: // 34MSB = 1000000000100000000010000000001000 // +1 = 1000000000100000000010000000001001 // 33MSB = 100000000010000000001000000000100 // M = 0x100401004 // q_fast = ((((uint64_t)A * (uint64_t)M) >> 32) + A) >> 1) >> S; #include <stdio.h> #include <stdint.h> #include <stdlib.h> #include <stdbool.h> // Adjust M by +/- this much when testing. #define M_ADJUST 3 // Fast division formula. #define FAST_DIV(A, M, S) (((A * (uint64_t)M) >> 32) >> S) // Automate finding S and M. void find_constants(const long double k, uint8_t *s, uint32_t *m) { uint64_t one_over_k_fixed; one_over_k_fixed = 0x8000000000000000ULL / k; *s = 0; // Find S. while (!(one_over_k_fixed & 0x4000000000000000ULL)) { (*s)++; one_over_k_fixed <<= 1; } // 33 MSB. one_over_k_fixed >>= 30; // +1. one_over_k_fixed++; // 32 MSB rounded. one_over_k_fixed >>= 1; // M. *m = one_over_k_fixed; } int main(int argc, char **argv) { uint32_t test_range_min = 0; uint32_t test_range_max = -1; long double k; uint32_t a; uint32_t m; uint8_t s; uint32_t m_test; uint32_t q_real; uint32_t q_fast; uint32_t ok_cnt; uint32_t m_best = 0; uint32_t m_best_cnt = 0; bool fail = false; // Extract divisor and range from command line arguments. if (argc < 2) { fprintf(stderr, "USAGE: %s <divisor> [test-range-min] [test-range-max]\r\n", argv[0]); exit(2); } sscanf(argv[1],"%Lf", &k); if (argc >= 3) { sscanf(argv[2],"%u", &test_range_min); } if (argc >= 4) { sscanf(argv[3],"%u", &test_range_max); } // Find appropriate values of S and M. find_constants(k, &s, &m); // Start testing with a lower-than calculated value of M. m_test = m - M_ADJUST; fprintf(stderr, "Divide by %Lf.\r\n" "Testing with all dividends from %u to %u\r\n", k, test_range_min, test_range_max); // Test multiple candidate values above and below the calculated M. for (m_test=m-M_ADJUST; m_test<=(m+M_ADJUST); m_test++) { // Report progress. fprintf(stderr, "\rUsing: K=%Lf, S=%u, M=0x%08X", k, s, m_test); // Reset test. fail = false; ok_cnt = 0; // Test all dividends with this M. for (a=test_range_min; a<test_range_max; a++) { q_real = a / k; q_fast = FAST_DIV(a, m_test, s); // Stop testing this M if a test fails. if (q_real != q_fast) { fail = true; // Track 'best' fit so far. if (ok_cnt > m_best_cnt) { m_best = m_test; m_best_cnt = ok_cnt; } break; } else { ok_cnt++; } } // Stop searching when a value for M works for all tested dividends. if (!fail) { break; } } // Final report. if (fail) { fprintf(stderr, "\r\nNo exact solution found. Printing best fit.\r\n\n"); m = m_best; } else { m = m_test; fprintf(stderr, "\r\nAll results are good!\r\n\n"); } printf("#define FAST_DIV_S %u\n", s); printf("#define FAST_DIV_M %u\n", m); return 0; }

It computes the correct multiplicand and shift value for a fast-division formula, then tests it against the FP division for a specified range of integer dividends, adjusting the multiplicand with each pass. When a multiplicand is found which yields the same integer result for all dividends in that range as does the FP division, the program prints out the successful multiplicand and shift value. If it can't find one, it tells you so. Smaller ranges are more likely to succeed., so if you only need "Precision: +/- a few %", then you can specify a narrow range and be satisfied that the results will be sufficient.

It will work for any divisor, not just an integer one. So it should be accurate even with timer tick frequencies with non-integer megahertz values.

This one computes a 32-bit multiplicand so the fast-division formula involves 64-bit, so it might be too much for a wee micro. You can adapt it to 16/32, or 8/16 as you need.

I just tested it on my 2.1 GHz i3. Took < 1min to validate the full range of dividends from 0 to 2^32-1 for a divisor of 48.

I just quickly adapted it from an existing helper which pumped out a full C header file for another project, so it's a bit dirty, and I might have missed something.

If you need a windows binary, I can build one for you with mingw32.

EDIT: fixed typo which was a bug.

/* convert timer count to microseconds with a formula of the form us=(TCNT*FACTOR)>>P Assume timer frequency, 48...150 MHz, is known to the kilohertz, and one percent accuracy is sufficient. There is a usable FACTOR in the range 100..200 .From spararow2's analysis, 1+(UINT32_MAX-1)/2..UINT32_MAX might be better.FACTOR*2**-P should be roughly 1000/FKHZ, where FKHZ is the timer frequency in kilohertz , or 1000000/FHZ, where FHZ is the timer frequency in Hertz. 2**P should be roughly FACTOR*FKHZ/1000 >= FKHZ/10or FACTOR*FHZ/1000000 >= 1+((1+(UINT32_MAX-1)/2)*FKHZ-1)/1000000FACTOR should be roughly (2**P)*1000/FKHZor (2**P)*1000000/FHZ*/ #define FKHZ ... #define LG2ARG (((FKHZ)+9)/10)// or (1+((1+(UINT32_MAX-1)/2)*FKHZ-1)/1000000)#include "lg2.inc" enum { P=LG2hi } ; //ceiling(LG2ARG)~~ceiling(lg2(((FKHZ)+9)/10))~~// This part is straightforward. // Getting P was the tricky part. // The best FACTOR for the derived P: const unsigned long long FACTOR= ((1ULL<<P)*1000+FKHZ/2)/FKHZ ;// or ((1ULL<<P)*1000000+FKHZ/2)/FHZ... unsigned long long us=(TCNT*FACTOR)>>P; lg2.inc:// edit allowed 32 as a possibility#if 1<<32 <= LG2ARG #define LG2Z 32 #else #define LG2Z 0 #endif #if 1<<(LG2Z+16) <= LG2ARG #define LG2A (LG2Z+16) #else #define LG2A LG2Z #endif #if 1<<(LG2A+8) <= LG2ARG #define LG2B (LG2A+8) #else #define LG2B LG2A #endif #if 1<<(LG2B+4) <= LG2ARG #define LG2C (LG2B+4) #else #define LG2C LG2B #endif #if 1<<(LG2C+4) <= LG2ARG #define LG2D (LG2C+4) #else #define LG2D LG2C #endif #if 1<<(LG2D+2) <= LG2ARG #define LG2E (LG2D+2) #else #define LG2E LG2D #endif #if 1<<(LG2E+1) <= LG2ARG #define LG2lo (LG2E+1) #else #define LG2lo LG2E #endif #if 1<<LG2lo < LG2ARG #define LG2hi (lG2lo+1) #else #define LG2hi LG2lo #endif

Rather clunky, but it works.

Note that if P is required to be a multiple of 8, the problem simplifies to selecting one of four choices:

#define errr(q) ... #if errr(8)< errr(0) #define P 8 #else #define P 0 #endif #if errr(16)< errr(P) #undef P #define P 16 #endif #if errr(24)< errr(P) #undef P #define P 24 #endif

I do not know that I would want to use either one.

One reason to avoid code-generation is to avoid figuring out how to put it in a make file.

Not good enough.

I'd put the generated code in the object directory,

but would not give it the same name as its source.

E.g.

$(OBJ)/fred-gen.c: $(SRC)/fred.c awk -f $(SRC)/fred.awk $^ > $@ $(OBJ)/fred-gen.o: $(OBJ)/fred-gen.c $(CC) -o $@ ... $^

I've used awk instead of Python because I suspect it of being more widely available or at least present.

But if 1% error is ok it's simple:

Mul with the best 8 bit number (shifted to be in the range of 128 and 255)

Do the correct amount of shifts down. (that could be up depending of needed shifts(perhaps not for 8 bit haven't checked))

if your coefficient is an even number perhaps it better(read faster) to mul with 1/2 number (the result will be the same) (repeat if that is an even number aswell)

But the real question is the number of bits in the input (the timer), If normal C is used you should MUL with a number of same size. then the result is 2n bits and then do the shifts (back to n bits)

**sparrow2 wrote:**

But if 1% error is ok it's simple:

Mul with the best 8 bit number (shifted to be in the range of 128 and 255)

Do the correct amount of shifts down. (that could be up depending of needed shifts(perhaps not for 8 bit haven't checked))

so the shift will be down for any integral factor.

Getting either the shift or the factor without the other is tricky.

As most factors do not have useful shifts, I derive a shift from a lower limit on the factor.

It involves getting a base-2 log, rounded up.

Once one has the shift, getting the factor is easy.

if your coefficient is an even number perhaps it better(read faster) to mul with 1/2 number (the result will be the same) (repeat if that is an even number aswell)

If one wants a multiple of eight ((P+7)/8*8) is likely the number one wants.

But the real question is the number of bits in the input (the timer), If normal C is used you should MUL with a number of same size. then the result is 2n bits and then do the shifts (back to n bits)

My edits above used a clunkier formula.

I think you guys could be clearer when explaining your algorithms, it was easier for me to derive it from scratch rather than reading these posts.

I'll detail the algo here, even if just for my own future reference. I'll probably just forget this soon...

- We have a tick count
**n**and frequency**F**, in MHz. We want time**n/F**in microseconds. =**n/F*****n*****(2^k)/F**. This represents the well know method of turning a division into a multiply plus a shift (i.e. division by power of 2).**1/(2^k)**- Now we need to find the so called "magic number", factor
**(2^k)/F**. To find exponent**k**we apply**log2**to this expression:**log2 ((2^k)/F)**=**k**-**log2(F)**. - The integer part of
**log2(F)**is just the bit order of the MSB of the binary representation of**F**. For example, if**F**= 50 (decimal) = 110010 (binary), the MSB is BIT5, thus**log2(**50**)**= 5.xxx, this calculation is the trickiest part. x86 assembly has the BSR instruction for this. - For maximum precision, we want "magic number"
**(2^k)/F**to fit as close as possible in 8, 16, etc. bits. Let's say 16 bits, in other words, we want**log2 ((2^k)/F)**=**k**-**log2(F)**=**16**. In the previous example, with**F**= 50 MHz,**int(log2(F))**= 5, thus**k**=**16**+**5**=**21**. - Now we just need to calculate "magic number"
**(2^k)/F**by replacing**k**and**F**, in this example, we obtain "magic number" = 2^21/50 = 41943.04, this can be round up, down or nearest according to preference. - Thus the final calculation for
**F**=**50MHz**and**magic number size**=**16 bits**is: n * 41943, then shift right by 21 bits.

I hope this is helpful to someone. To implement as a macro, well, maybe some other day...

Thanks, I think your explanation brings a lot of clarity to the problem.

if the preprocessor don't have log2(x) remember that it's the same as log(x)/log(2) .

And to be sure I would use the magic numbers and range (in (32768..65535) ok , in(16384..32767) shift one .........), to make sure that the power of 2 numbers is correct.

I have only used this for ASM so I guess I look a bit different at the problem as #18, because for me >>21 is the same as <<3 and use the correct bytes, and if compiler don't see that >>5 can be used

I don't think that you save much over a div.

And remember if it's a tiny just use the div ;)

And about the rounding if your factor is smaller than the correct number you want to round up, if it's to big you want to round down.

Ok, I wrote some macros to calculate exponent and mantissa according to the algo in my previous post:

#define BIT(x) (1<<(x)) #define GET_MSB(x) (\ ((x) == 0)?(1/0):\ ((x) < BIT(1))?0:\ ((x) < BIT(2))?1:\ ((x) < BIT(3))?2:\ ((x) < BIT(4))?3:\ ((x) < BIT(5))?4:\ ((x) < BIT(6))?5:\ ((x) < BIT(7))?6:\ ((x) < BIT(8))?7:\ ((x) < BIT(9))?8:\ ((x) < BIT(10))?9:\ ((x) < BIT(11))?10:\ ((x) < BIT(12))?11:\ ((x) < BIT(13))?12:\ ((x) < BIT(14))?13:\ ((x) < BIT(15))?14:\ ((x) < BIT(16))?15:\ ((x) < BIT(17))?16:\ ((x) < BIT(18))?17:\ ((x) < BIT(19))?18:\ ((x) < BIT(20))?19:\ ((x) < BIT(21))?20:\ ((x) < BIT(22))?21:\ ((x) < BIT(23))?22:\ ((x) < BIT(24))?23:\ ((x) < BIT(25))?24:\ ((x) < BIT(26))?25:\ ((x) < BIT(27))?26:\ ((x) < BIT(28))?27:\ ((x) < BIT(29))?28:\ ((x) < BIT(30))?29:\ ((x) < BIT(31))?30:31\ ) #define DESIRED_BITS 24 #define EXPONENT(frequency) (DESIRED_BITS + GET_MSB(frequency)) #define MANTISSA(frequency) ((1<<EXPONENT(frequency))/(frequency))

DESIRED_BITS holds the size you want for the magic number, here I put 24, could be 8 or 16. Size 32 may or may not work, depending on the internal representation of the compiler.

edit: there was a bug in the GET_MSB macro, corrected and added the "feature" of generating a division by zero if frequency == 0.