An answer to this question on Stack Overflow.
Question
I'm not a specialist, but as far as I know, each bit of information in a QR code is encoded more than once, and this is called the redundancy level.
How can I estimate a QR code's redundancy level? Is there a mobile app or a website where I can easily test my QR code's redundancy level? If not, is there an easy algorithm that I can implement?
Redundancy is sorted into different categories according to this website, but I'd like to have the direct percentage value if possible.
Answer
QR codes contain a couple of bits which indicate the error correction level, as depicted below (source):
An answer to this question on Stack Overflow.
Question
The sigmoid function is defined as
S(t) = 1 / (1 + e^(-t))
(where ^ is pow)
I found that using the C built-in function exp() to calculate the value of S(t) is slow. Is there any faster algorithm to calculate the value of S(t)?
Answer
This question has significantly more detail including these benchmarked results:
name                                    rms_error   maxdiff     time_us  speedup  samples
logistic_with_tanh                      5.9496e-02  1.5014e-01  0.0393   0.5076   200000001
logistic_with_atan                      3.9051e-02  9.6934e-02  0.0321   0.6211   200000001
logistic_with_erf                       6.5068e-02  1.6581e-01  0.0299   0.6676   200000001
logistic_fexp_quintic_approx            1.2921e-07  5.9050e-07  0.0246   0.8118   200000001
logistic_product_approx_float128        8.7032e-04  1.7217e-03  0.0209   0.9523   200000001
logistic_with_exp_no_overflow           4.7660e-17  1.6653e-16  0.0198   1.0084   200000001
logistic_product_approx128              8.7032e-04  1.7211e-03  0.0164   1.2187   200000001
log_w_approx_exp_no_overflow128         8.7193e-04  1.7211e-03  0.0158   1.2640   200000001
logistic_with_sqrt                      8.3414e-02  1.1086e-01  0.0146   1.3662   200000001
log_w_approx_exp_no_overflow16          6.9726e-03  1.4074e-02  0.0141   1.4114   200000001
log_w_approx_exp_no_overflow16_clamped  6.9726e-03  1.4074e-02  0.0141   1.4153   200000001
logistic_schraudolph_approx             1.5661e-03  8.9906e-03  0.0138   1.4497   200000001
logistic_with_abs                       6.0968e-02  8.2289e-02  0.0134   1.4936   200000001
logistic_orig                           0.0000e+00  0.0000e+00  0.0199   ------   200000001
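To give a feel for the accuracy/speed trade-off the table measures, here is a minimal C++ sketch of two variants. The function names and formulas are my own assumptions, not code from the linked benchmark: one is an overflow-safe exact logistic, the other is the common x / (1 + |x|) substitute rescaled to (0, 1), which is only sigmoid-shaped and not a close approximation.

```cpp
#include <cmath>

// Exact logistic function, arranged so exp() is only ever called on a
// non-positive argument and therefore cannot overflow.
double logistic_stable(double x) {
    if (x >= 0.0) {
        return 1.0 / (1.0 + std::exp(-x));
    }
    const double e = std::exp(x);
    return e / (1.0 + e);
}

// Cheap sigmoid-shaped substitute (NOT the logistic function):
// rescales x / (1 + |x|) from (-1, 1) onto (0, 1). No exp() call at all,
// at the price of a large error away from x = 0.
double sigmoid_like_abs(double x) {
    return 0.5 * x / (1.0 + std::fabs(x)) + 0.5;
}
```

Whether the cheaper form is acceptable depends on how much error the application tolerates, so it is worth benchmarking and measuring the error on your own inputs.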
An answer to this question on Stack Overflow.
Question
Problem background
I have a 2D array Map[Height][Width] that stores a bool to represent each cell.
- There exist two non-overlapping regions RegionA and RegionB.
- Each region is a list of unique adjacent integer co-ordinates.
- The number of co-ordinates in RegionA and RegionB are m and n respectively
- A co-ordinate in RegionA will never be horizontally or vertically adjacent to a co-ordinate in RegionB (i.e. the regions do not touch)
- The bounding box of RegionA could overlap with the bounding box of RegionB
- RegionA and RegionB may have holes
- RegionA may or may not surround RegionB (and vice versa)
- If a co-ordinate (X, Y) is part of a region then Map[Y][X] == 1, otherwise it's zero.
Problem
I'm looking for an algorithm to determine the two co-ordinates A and B that have the minimum Euclidean distance, where A is a member of RegionA and B is a member of RegionB.
The brute force method has a time complexity of O(mn).
Is there a more time-efficient algorithm (preferably better than O(mn)) for solving this problem?
The name + link, code, or description of one instance of such an algorithm would be greatly appreciated and will be accepted as an answer.
C++ code to create two random regions A and B
The region generation code is immaterial to the problem being asked, and I am only interested in how close the regions are to each other. In the project I am working on, I used cellular automata to create the regions. I can include the code for this if necessary. The region construction procedure exists only to create relevant examples.
#include <cstdlib>
#include <cstring>
#include <vector>
#include <cstdio>
#include <map>
#include <ctime>
#include <cmath>

constexpr int Width = 150;
constexpr int Height = 37;
bool Map[Height][Width];

struct coord {
    int X, Y;
    coord(int X, int Y): X(X), Y(Y) {}
    bool operator==(const coord& Other) {
        return Other.X == X && Other.Y == Y;
    }
    bool IsAdjacentTo(const coord& Other){
        int ManhattanDistance = abs(X - Other.X) + abs(Y - Other.Y);
        return ManhattanDistance == 1;
    }
    bool IsOnMapEdge(){
        return X == 0 || Y == 0 || X == Width - 1 || Y == Height - 1;
    }
};

using region = std::vector<coord>;

void ClearMap(){
    std::memset(Map, 0, sizeof Map);
}

bool& GetMapBool(coord Coord) {
    return Map[Coord.Y][Coord.X];
}

int RandInt(int LowerBound, int UpperBound){
    return std::rand() % (UpperBound - LowerBound + 1) + LowerBound;
}

region CreateRegion(){
    std::puts("Creating Region");
    ClearMap();
    region Region;
    int RegionSize = RandInt(50, Width * Height / 2);
    Region.reserve(RegionSize);
    while (true){
        Region.emplace_back(
            RandInt(1, Width - 1),
            RandInt(1, Height - 1)
        );
        if (!Region[0].IsOnMapEdge())
            break;
        else
            Region.pop_back();
    }
    GetMapBool(Region[0]) = 1;
    while (Region.size() != RegionSize){
        coord Member = Region[RandInt(0, Region.size() - 1)];
        coord NeighbourToMakeAMember {
            RandInt(-1, 1) + Member.X,
            RandInt(-1, 1) + Member.Y
        };
        if (!Member.IsOnMapEdge() && !GetMapBool(NeighbourToMakeAMember) && Member.IsAdjacentTo(NeighbourToMakeAMember)){
            GetMapBool(NeighbourToMakeAMember) = 1;
            Region.push_back(NeighbourToMakeAMember);
        };
    }
    std::puts("Created Region");
    return Region;
}

std::pair<region, region> CreateTwoRegions(){
    bool RegionsOverlap;
    std::pair<region, region> Regions;
    do {
        Regions.first = CreateRegion();
        Regions.second = CreateRegion();
        ClearMap();
        for (coord Member : Regions.first){
            GetMapBool(Member) = 1;
        }
        RegionsOverlap = 0;
        for (coord Member : Regions.second){
            if (GetMapBool(Member)){
                ClearMap();
                std::puts("Regions Overlap/Touch");
                RegionsOverlap = 1;
                break;
            } else {
                GetMapBool(Member) = 1;
            }
        }
    } while (RegionsOverlap);
    return Regions;
}

void DisplayMap(){
    for (int Y = 0; Y < Height; ++Y){
        for (int X = 0; X < Width; ++X)
            std::printf("%c", (Map[Y][X] ? '1' : '-'));
        std::puts("");
    }
}

int main(){
    int Seed = time(NULL);
    std::srand(Seed);
    ClearMap();
    auto[RegionA, RegionB] = CreateTwoRegions();
    DisplayMap();
}
Problem illustration
What are the co-ordinates of the points A and B that form the minimum Euclidean distance between the two regions?
Answer
Notes on finding the nearest points between two regions
As others have mentioned, you can improve on finding the closest points between the two regions of n and m points by first filtering each region's points to only those which border unassigned cells and then doing an all-pairs search between the boundaries. For a roughly rectangular region this would take O(m + n + 16 √m √n) time: two linear filters and then the all-pairs search. I've implemented an example of this search below.
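A minimal sketch of that boundary-filtered search could look like the following. The names Cell, boundary_cells and closest_pair are hypothetical, and the grid is assumed to be true for cells of either region (as in the question's Map). A cell counts as a boundary cell if any 4-connected neighbour is empty or off the map, which is enough because the closest cell of a region always has such a neighbour in the direction of the other region.

```cpp
#include <limits>
#include <vector>

struct Cell { int x, y; };

// Keep only the cells that have at least one empty (or off-map)
// 4-connected neighbour. grid[y][x] is true for cells of ANY region.
std::vector<Cell> boundary_cells(const std::vector<Cell>& region,
                                 const std::vector<std::vector<bool>>& grid) {
    const int h = static_cast<int>(grid.size());
    const int w = h ? static_cast<int>(grid[0].size()) : 0;
    auto empty = [&](int x, int y) {
        return x < 0 || y < 0 || x >= w || y >= h || !grid[y][x];
    };
    std::vector<Cell> out;
    for (const Cell& c : region) {
        if (empty(c.x - 1, c.y) || empty(c.x + 1, c.y) ||
            empty(c.x, c.y - 1) || empty(c.x, c.y + 1)) {
            out.push_back(c);
        }
    }
    return out;
}

struct ClosestPair { Cell a, b; long long dist2; };

// Brute-force all-pairs search; compares squared distances so no
// floating point or sqrt is needed.
ClosestPair closest_pair(const std::vector<Cell>& a, const std::vector<Cell>& b) {
    ClosestPair best{{0, 0}, {0, 0}, std::numeric_limits<long long>::max()};
    for (const Cell& p : a) {
        for (const Cell& q : b) {
            const long long dx = p.x - q.x, dy = p.y - q.y;
            const long long d2 = dx * dx + dy * dy;
            if (d2 < best.dist2) best = {p, q, d2};
        }
    }
    return best;
}
```

Typical use would be closest_pair(boundary_cells(RegionA, grid), boundary_cells(RegionB, grid)), taking the square root of dist2 only at the end if the actual distance is needed.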
But this will not save you any time if your regions are shaped like this:
########
#      #
# #### #
# #  # #
# # ## #
# #    #
# ######
If you are using larger datasets and can tolerate preprocessing you can get further acceleration using a 2D k-d tree.
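Construction takes O(n log n) time, after which each nearest-neighbor query takes O(log n) time on average, so m queries take O(m log n) time. The total time to find the closest points is then O(m log n + n log n) if you build the tree on the n-point region, or O(n log m + m log m) if you build it on the m-point region.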
You can do some benchmarking to determine if it is better to build the k-d tree for the larger or the smaller of the regions. Remember that this answer may change depending on how many times you're able to reuse the same k-d tree for queries against different regions.
For small datasets you'll probably find that the all-pairs comparison is faster because it has much greater cache locality.
Note that implementing a k-d tree is tricky, so you'll probably want to use an existing library or definitely use test cases if you're rolling your own.
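If you do roll your own, a compact version might look like the sketch below. It is an illustration under my own assumptions rather than a tuned implementation: the names Point and KdTree2D are hypothetical, coordinates are integers, the tree is stored implicitly in a median-partitioned array, and only single nearest-neighbor queries are supported.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

struct Point { int x, y; };

class KdTree2D {
public:
    explicit KdTree2D(std::vector<Point> pts) : pts_(std::move(pts)) {
        build(0, pts_.size(), 0);
    }

    // Returns the stored point nearest to q. The tree must be non-empty.
    Point nearest(Point q) const {
        Point best = pts_.front();
        long long best_d2 = std::numeric_limits<long long>::max();
        search(0, pts_.size(), 0, q, best, best_d2);
        return best;
    }

private:
    static long long coord(const Point& p, int axis) { return axis == 0 ? p.x : p.y; }
    static long long dist2(const Point& a, const Point& b) {
        const long long dx = a.x - b.x, dy = a.y - b.y;
        return dx * dx + dy * dy;
    }

    // Put the median along the splitting axis at the middle of [lo, hi),
    // then recurse on both halves with the other axis. The array layout
    // itself encodes the tree, so no node pointers are needed.
    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(pts_.begin() + lo, pts_.begin() + mid, pts_.begin() + hi,
                         [axis](const Point& a, const Point& b) {
                             return coord(a, axis) < coord(b, axis);
                         });
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }

    void search(std::size_t lo, std::size_t hi, int axis, const Point& q,
                Point& best, long long& best_d2) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const long long d2 = dist2(pts_[mid], q);
        if (d2 < best_d2) { best_d2 = d2; best = pts_[mid]; }
        const long long delta = coord(q, axis) - coord(pts_[mid], axis);
        // Visit the side containing q first; visit the far side only if the
        // splitting line is closer than the best match found so far.
        if (delta < 0) {
            search(lo, mid, 1 - axis, q, best, best_d2);
            if (delta * delta < best_d2) search(mid + 1, hi, 1 - axis, q, best, best_d2);
        } else {
            search(mid + 1, hi, 1 - axis, q, best, best_d2);
            if (delta * delta < best_d2) search(lo, mid, 1 - axis, q, best, best_d2);
        }
    }

    std::vector<Point> pts_;
};
```

To find the closest pair, build the tree on one region and call nearest() for every point of the other, keeping the smallest distance seen. Comparing the results against the brute-force search on random inputs is a cheap and effective test.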
Another algorithm
If we happen to know that the regions are within a distance d of each other we can get an even faster algorithm.
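For one of the regions we can build a hash table where a reference to each member cell is stored at a hash location (c.y // d) * ceil(width / d) + (c.x // d), that is, the row-major index of the d-by-d "meta-cell" containing it. (Multiplying by the number of meta-cells per row, rather than by the map height, keeps distinct meta-cells from sharing a location.) To find the nearest neighbor to a query point we then do an all-pairs check against the points referenced in the associated meta-cell as well as the meta-cell's eight neighbors.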
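A rough sketch of this meta-cell lookup follows. The class name BucketGrid and its interface are hypothetical, coordinates are assumed to be non-negative, and the result is only guaranteed to be the true nearest neighbor when some stored cell really does lie within distance d of the query.

```cpp
#include <limits>
#include <optional>
#include <unordered_map>
#include <vector>

struct Cell { int x, y; };

class BucketGrid {
public:
    // d: assumed upper bound on the distance between the regions.
    // width: map width. Both must be positive.
    BucketGrid(const std::vector<Cell>& cells, int d, int width)
        : d_(d), cols_((width + d - 1) / d) {
        for (const Cell& c : cells) {
            buckets_[key(c.x / d_, c.y / d_)].push_back(c);
        }
    }

    // Nearest stored cell within the 3x3 block of meta-cells around q,
    // or nullopt if that block contains no stored cells.
    std::optional<Cell> nearest(Cell q) const {
        std::optional<Cell> best;
        long long best_d2 = std::numeric_limits<long long>::max();
        const int mx = q.x / d_, my = q.y / d_;
        for (int gy = my - 1; gy <= my + 1; ++gy) {
            for (int gx = mx - 1; gx <= mx + 1; ++gx) {
                // Skip meta-cells outside the map so keys never alias.
                if (gx < 0 || gy < 0 || gx >= cols_) continue;
                const auto it = buckets_.find(key(gx, gy));
                if (it == buckets_.end()) continue;
                for (const Cell& c : it->second) {
                    const long long dx = c.x - q.x, dy = c.y - q.y;
                    const long long d2 = dx * dx + dy * dy;
                    if (d2 < best_d2) { best_d2 = d2; best = c; }
                }
            }
        }
        return best;
    }

private:
    // Row-major index of meta-cell (gx, gy).
    long long key(int gx, int gy) const {
        return static_cast<long long>(gy) * cols_ + gx;
    }

    int d_;
    int cols_;
    std::unordered_map<long long, std::vector<Cell>> buckets_;
};
```

Any cell within distance d of the query differs from it by at most d in each coordinate, so it falls in the query's meta-cell or one of its eight neighbors, which is why checking the 3x3 block suffices.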
Notes on region building
On the off-chance you're trying to grow the regions quickly, but their boundaries can overlap, this should do the trick.
Note that your question can be improved by stating why you are trying to calculate a thing. If you want to know the closest-cell so you have a primitive to build regions more quickly, then reconsidering how region building works might be a better use of your time, and it's what I address here.
If region generation is immaterial and you're only interested in how close they are to each other, then focus on defining, conceptually, what a region is and be very clear that your region construction procedure exists only to create relevant examples.
The fundamental trick we'll use to accelerate region construction is to make a space-time tradeoff. Rather than storing the map as a set of boolean values, we'll store it as a set of 8-bit integers. We reserve a few of these integers for special purposes:
- 0 - An unassigned space that a region may be built on
- 1 - A buffer cell that isn't part of any region, but close to one. This is not available for building on.
- 2, 3, 4, ... - valid region labels.
Thus, the algorithm for building a region becomes simple:
- Find a valid seed cell
- Grow outward from the seed cell via 4-connected cells until the desired number of cells is reached
- A new Region Cell can only be placed in an Unassigned Space
- Iterate over Region Cells marking their 8-connected neighbors as Buffer cells.
- Note that, if we add the new Buffer cells to the list of Region Cells and repeat the last step, we can grow arbitrarily large buffers between regions.
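The steps above can be sketched roughly as follows. This is an illustration under my own assumptions: the names LabelMap and grow_region are hypothetical, growth picks a random frontier cell and a random free 4-connected neighbour, and the caller supplies a seed cell and a label of 2 or higher.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

constexpr std::uint8_t kUnassigned = 0;  // free to build on
constexpr std::uint8_t kBuffer = 1;      // close to a region, not buildable
// Labels 2, 3, 4, ... identify regions.

struct Cell { int x, y; };

struct LabelMap {
    int width, height;
    std::vector<std::uint8_t> labels;

    LabelMap(int w, int h)
        : width(w), height(h),
          labels(static_cast<std::size_t>(w) * h, kUnassigned) {}

    bool in_bounds(int x, int y) const {
        return x >= 0 && y >= 0 && x < width && y < height;
    }
    std::uint8_t& at(int x, int y) {
        return labels[static_cast<std::size_t>(y) * width + x];
    }
};

// Grows a 4-connected region from `seed`, then surrounds it with buffer
// cells. Returns the claimed cells: empty if the seed is not buildable,
// and fewer than `target` if the region gets boxed in.
std::vector<Cell> grow_region(LabelMap& map, Cell seed, std::uint8_t label,
                              std::size_t target, std::mt19937& rng) {
    std::vector<Cell> region;
    if (!map.in_bounds(seed.x, seed.y) || map.at(seed.x, seed.y) != kUnassigned) {
        return region;
    }
    map.at(seed.x, seed.y) = label;
    region.push_back(seed);
    std::vector<Cell> frontier{seed};  // cells that may still have free neighbours
    constexpr int dx[4] = {1, -1, 0, 0};
    constexpr int dy[4] = {0, 0, 1, -1};
    while (region.size() < target && !frontier.empty()) {
        std::uniform_int_distribution<std::size_t> pick(0, frontier.size() - 1);
        const std::size_t i = pick(rng);
        const Cell c = frontier[i];
        std::vector<Cell> candidates;
        for (int k = 0; k < 4; ++k) {
            const int nx = c.x + dx[k], ny = c.y + dy[k];
            if (map.in_bounds(nx, ny) && map.at(nx, ny) == kUnassigned) {
                candidates.push_back({nx, ny});
            }
        }
        if (candidates.empty()) {  // exhausted cell: drop it from the frontier
            frontier[i] = frontier.back();
            frontier.pop_back();
            continue;
        }
        std::uniform_int_distribution<std::size_t> pick_n(0, candidates.size() - 1);
        const Cell n = candidates[pick_n(rng)];
        map.at(n.x, n.y) = label;
        region.push_back(n);
        frontier.push_back(n);
    }
    // Mark every still-unassigned 8-connected neighbour as a buffer cell.
    for (const Cell& c : region) {
        for (int oy = -1; oy <= 1; ++oy) {
            for (int ox = -1; ox <= 1; ++ox) {
                const int nx = c.x + ox, ny = c.y + oy;
                if (map.in_bounds(nx, ny) && map.at(nx, ny) == kUnassigned) {
                    map.at(nx, ny) = kBuffer;
                }
            }
        }
    }
    return region;
}
```

Because a later region can only claim unassigned cells, it can never touch an earlier one, so no overlap check or retry loop is needed.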
A few other parts of your code needed cleaning. Types are typically defined using Camel Case and variables use snake case. I prefer snake_case for functions as well. See a style guide for more opinions.
Do not use global variables: they are bad and, in the fullness of time, they will bring you nothing but sorrow.
You're using C++, so avoid headers such as stdio, stdlib, and cstring. They provide the C way of doing things, which often comes with disadvantages such as diminished type safety.
Any time your if statements have a lot of conditions, ask yourself whether they can be simplified by inverting one or more of the conditions (that is, switching is_good() to !is_good()) and doing an early exit. This can make your code much easier to read and reason about.
An answer to this question on Stack Overflow.
Question
There are 3 ways (that I know of) to suppress the "unused variable" warning. Is any particular way better than the others?
First
- (void)testString:(NSString *)testString
{
    (void)testString;
}
Second
- (void)testString:(NSString *)__unused testString
{
}
Third
- (void)testString:(NSString *)testString
{
#pragma unused(testString)
}
Answer
In C23 you can write
[[maybe_unused]] int somevar;
(details)
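C++17 spells the attribute the same way, and it can also be applied to a function parameter, which is the case the question asks about. Below is a minimal C++ sketch with a hypothetical function name; whether a particular Objective-C toolchain accepts this attribute is not something this answer covers.

```cpp
#include <string>

// The parameter is intentionally unused. The attribute documents that and
// suppresses the unused-parameter warning without a (void) cast or pragma.
int handle_event([[maybe_unused]] const std::string& name) {
    [[maybe_unused]] int scratch = 0;  // an unused local is silenced the same way
    return 0;
}
```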
I recently had a genetic panel done. Hundreds of genes were tested and I came back as a carrier for a couple of things. This isn't surprising:
- Lazarin et al (2012) find that "24% of individuals were identified as carriers for at least one of 108 disorders, and 5.2% were carriers for multiple disorders".
- Fridman et al (2021) find that "based on 6,447 exome sequences of healthy, genetically unrelated Europeans of two distinct ancestries, we estimate that every individual is a carrier of at least 2 pathogenic variants in currently known autosomal-recessive (AR) genes and that 0.8%–1% of European couples are at risk of having a child affected with a severe AR genetic disorder."
Updates from positive tests
One of the genes I'm a carrier for is associated with Glycogen storage disease type II.
Knowing I'm a carrier means it's straight-forward to use Mendelian genetics to determine that my siblings have a 50% chance of being carriers and my niblings have a 25% chance of being carriers.
The background incidence of this mutation is ~1 in 530, which are also the odds a partner would be a carrier. If we had a child, there'd be a 25% chance said child would inherit the mutation from both of us and be affected as a result (ignoring the mysteriousness of incomplete penetrance).
Multiple mutations can cause GSDII, so its background rate is ~1 in 40,000. Considering only the one mutation I carry, and before my partner is tested, the odds that a randomly selected partner and I would have a child with GSDII are $\frac{1}{530}\cdot\frac{1}{4}=\frac{1}{2,120}$.
Note that this only accounts for a partner carrying the same variant I do. GSDII can also arise from compound heterozygosity — my child inheriting my mutation from me and a different GAA mutation from my partner. Since the disease requires two carrier parents who each pass on the mutation, the disease incidence equals $q^2 \cdot \frac{1}{4} = \frac{1}{40,000}$, giving an overall carrier frequency of $q = \frac{1}{100}$. That's considerably higher than the 1/530 rate for my specific variant, so the total pre-test risk is higher than 1/2,120. Either way, post-test these odds can be reduced to essentially zero if my partner were to do similar testing.
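Spelling out that arithmetic, under the same simplifying assumptions (a randomly selected partner, every pathogenic GAA variant lumped into a single carrier frequency $q$, and full penetrance), a rough figure for the total risk given that I am a carrier is:

$$ q^2\cdot\frac{1}{4}=\frac{1}{40,000} \;\Rightarrow\; q^2=\frac{1}{10,000} \;\Rightarrow\; q=\frac{1}{100}, \qquad P(\textrm{affected child}\mid\textrm{I'm a carrier}) \approx q\cdot\frac{1}{4}=\frac{1}{400} $$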
For my siblings, the odds of their children having GSDII with a randomly selected partner are $\frac{1}{530}\cdot\frac{1}{2}\cdot\frac{1}{4}=\frac{1}{4,240}$, and the odds that both parents are carriers and the child is an unaffected carrier are $\frac{1}{530}\cdot\frac{1}{2}\cdot\frac{1}{2}=\frac{1}{2,120}$. The additional $\frac{1}{2}$ term is the odds that my sibling is also a carrier, given Mendelian genetics. The $\frac{1}{4,240}$ figure is, of course, much higher than the $\frac{1}{40,000}$ background rate, so the test results in a big update for their state of knowledge.
I also tested negative for many things. For autosomal recessive disorders that means there's a ~0% chance of my children having them sans de novo mutation.
Updates from negative tests
Positive carrier results give clear and significant updates for my siblings. But what about all the things I tested negative for — can my siblings learn anything from those results too? The answer is less obvious and requires a bit of work.
To figure this out, we need to pull out the Bayesian blender and pour (a) the background rate of the condition, (b) my parents' possible genotypes, and (c) Mendelian inheritance into it.
Let's call $q$ the frequency of carriers of a mutation in the population and $p=1-q$ the frequency of non-carriers.
Now, let's work through my parents' potential genotypes. (Note: in the following "carrier" means heterozygous (Aa) and "non-carrier" means homozygous wild-type (AA). Strictly, this ignores homozygous affected (aa) individuals, but for rare diseases the aa frequency is negligible and the omission doesn't matter.)
- Both parents are non-carriers:
  - Prior probability: $p^2$
  - Probability I'm a non-carrier given this scenario = 1
  - Probability my siblings are non-carriers = 1
- One parent is a carrier, one is a non-carrier:
  - Prior probability: $2pq$
  - Probability I'm a non-carrier given this scenario = 1/2
  - Probability my siblings are non-carriers = 1/2
- Both parents are carriers:
  - Prior probability: $q^2$
  - Probability I'm a non-carrier given this scenario = 1/4
  - Probability my siblings are non-carriers = 1/4
Now, Bayes' theorem says that
P(parental genotype | I'm a non-carrier) =
P(I'm a non-carrier | parental genotype)
* P(parental genotype) / P(I'm a non-carrier)
We have P(I'm a non-carrier) = $p^2\cdot1 + 2pq\cdot\frac{1}{2} + q^2\cdot\frac{1}{4}=p^2+pq+q^2/4$.
Therefore, we have:
- P(both parents non-carriers | Me non-carrier) = $\frac{p^2}{p^2+pq+q^2/4}$
- P(one parent carrier | Me non-carrier) = $\frac{pq}{p^2+pq+q^2/4}$
- P(both carrier | Me non-carrier) = $\frac{q^2}{4}\frac{1}{p^2+pq+q^2/4}$
Now,
P(sib non-carrier) = Σ[P(sibling is non-carrier | parental genotype) * P(parental genotype | I'm a non-carrier)]
plugging it all in we get: $$ P(\textrm{sib non-carrier}) = \frac{p^2 + pq/2 + q^2/16}{p^2+pq+q^2/4} $$
The numerator and denominator are both perfect squares: $p^2 + pq/2 + q^2/16 = (p+q/4)^2$ and $p^2+pq+q^2/4=(p+q/2)^2$, so this simplifies to
$$ P(\textrm{sib non-carrier}) = \left(\frac{p+q/4}{p+q/2}\right)^2 $$
Note that this derivation assumes full siblings sharing both parents; the update differs for half-siblings.
A mutation I don't have is the one for Sandhoff disease, which has a carrier frequency of between 1:310 and 1:276.
Choosing the higher value, q=1/276, we plug and chug to get P(sib non-carrier)=99.8%. Before I did this test, the population base rate gives the prior odds that my siblings would be non-carriers: p=99.6%. So the test represents a 0.2pp update for them.
Plotting the magnitude of the update across all possible carrier frequencies gives us a sense of when negative results start to matter:
set terminal svg size 800,500 font "Arial,14"
set output "autosomal-recessive-bayesian-update.svg"
set xrange [0:1]
set xlabel "Frequency"
set ylabel "Update given I'm not a carrier (pp)"
plot 100*((((1-x)+x/4)/((1-x)+x/2))**2 - (1-x)) t ""
From this we see that we'd need a carrier frequency of 20-40% before we got significant updates (about 9pp at 20% and about 17pp at 40%).
We can also use this to double-check our work. At 0% frequency the update goes to 0 as it should. At the other extreme, consider q=99.99%. In this case both parents are almost certainly carriers, and the population base rate for being a non-carrier is essentially p ≈ 0%. An Aa × Aa cross still produces wild-type offspring a quarter of the time, so the formula tends to $\left(\frac{1}{2}\right)^2=25\%$ for my sibling, an update of about 25pp over the base rate.


