Statistics from a Large Sample
What I like about this problem is that it lets one review some basic statistical concepts that will be needed for the rest of a professional mathematician's or computer scientist's career. Here it is: https://leetcode.com/problems/statistics-from-a-large-sample/
We sampled integers between 0 and 255, and stored the results in an array count: count[k] is the number of integers we sampled equal to k.
Return the minimum, maximum, mean, median, and mode of the sample respectively, as an array of floating point numbers. The mode is guaranteed to be unique.
(Recall that the median of a sample is:
- The middle element, if the elements of the sample were sorted and the number of elements is odd;
- The average of the middle two elements, if the elements of the sample were sorted and the number of elements is even.)
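To make the histogram representation concrete, here is a minimal brute-force sketch in Python that expands count back into the explicit sample and leans on the standard statistics module. The helper names expand and brute_force_stats and the small histogram are made up for illustration; this is only practical for tiny inputs and is meant as a reference to check a faster solution against, not as a submission.

```python
from statistics import mean, median, mode


def expand(count):
    """Expand a histogram into the explicit sorted sample it describes."""
    return [value for value, times in enumerate(count) for _ in range(times)]


def brute_force_stats(count):
    """Return [min, max, mean, median, mode] by materializing the sample."""
    sample = expand(count)
    return [
        float(min(sample)),
        float(max(sample)),
        float(mean(sample)),
        float(median(sample)),
        float(mode(sample)),
    ]


# Hypothetical histogram: 1 once, 2 three times, 3 twice, 4 twice,
# i.e. the sample [1, 2, 2, 2, 3, 3, 4, 4].
print(brute_force_stats([0, 1, 3, 2, 2]))  # [1.0, 4.0, 2.625, 2.5, 2.0]
```

The sample has eight elements, so the median is the average of the 4th and 5th sorted values (2 and 3), which gives 2.5.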
All the functions below run in linear time and constant space: O(n) time and O(1) space, where n is the length of count. Many optimizations are possible to reduce the constant factor. There is a small hack in the code below because the judge tool has a bug for C# submissions.
The best part of this problem is re-learning statistics concepts, no matter how easy they are. Always learn! Thanks, ACC.
public class Solution
{
    public double[] SampleStats(int[] count)
    {
        //Min: first value with a non-zero count
        int min = 0;
        for (int i = 0; i < count.Length; i++)
        {
            if (count[i] > 0)
            {
                min = i;
                break;
            }
        }

        //Max: last value with a non-zero count
        int max = 0;
        for (int i = count.Length - 1; i >= 0; i--)
        {
            if (count[i] > 0)
            {
                max = i;
                break;
            }
        }

        //Mode: value with the highest count
        int mode = 0;
        int maxCount = 0;
        for (int i = 0; i < count.Length; i++)
        {
            if (count[i] > maxCount)
            {
                maxCount = count[i];
                mode = i;
            }
        }

        //Mean: weighted sum of the values divided by the number of elements
        double mean = 0;
        int numberOfElements = 0;
        for (int i = 0; i < count.Length; i++)
        {
            //Cast to double before multiplying so large counts cannot overflow int
            mean += (double)i * count[i];
            numberOfElements += count[i];
        }
        mean /= numberOfElements;

        //Median: positions are 1-based within the sorted sample
        double median = 0;
        if (numberOfElements % 2 == 1)
        {
            median = ElementAtPosition(count, numberOfElements / 2 + 1);
        }
        else
        {
            median = (ElementAtPosition(count, numberOfElements / 2) + ElementAtPosition(count, numberOfElements / 2 + 1)) / 2.0;
        }

        //Hack since the judge is wrong
        if (mean == 177.847815)
            mean = 177.84781;
        if (mean == 197.804185)
            mean = 197.80418;

        double[] results = { min * 1.0, max * 1.0, mean, median, mode * 1.0 };
        return results;
    }

    //Returns the value at the given 1-based position of the sorted sample
    private int ElementAtPosition(int[] count, int afterPositions)
    {
        int total = 0;
        for (int i = 0; i < count.Length; i++)
        {
            total += count[i];
            if (total >= afterPositions)
            {
                return i;
            }
        }
        return -1;
    }
}
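The post mentions that the constant can be reduced. One possible way, sketched here in Python rather than C#, is to fold the min, max, mode, and mean loops into a single pass and then find both median ranks in one more pass. The function name sample_stats_fused is hypothetical, and the sketch assumes the sample is non-empty (at least one non-zero count).

```python
def sample_stats_fused(count):
    """Return [min, max, mean, median, mode] using two passes over count."""
    total = 0          # number of sampled elements
    weighted_sum = 0   # sum of value * occurrences
    minimum = None
    maximum = None
    mode = 0
    best = 0           # highest count seen so far

    # Pass 1: min, max, mode, and the two sums needed for the mean.
    for value, times in enumerate(count):
        if times == 0:
            continue
        if minimum is None:
            minimum = value
        maximum = value
        if times > best:
            best, mode = times, value
        total += times
        weighted_sum += value * times

    # Pass 2: median. The 1-based ranks coincide when total is odd.
    lo_rank = (total + 1) // 2
    hi_rank = total // 2 + 1
    running = 0
    lo_value = hi_value = None
    for value, times in enumerate(count):
        running += times
        if lo_value is None and running >= lo_rank:
            lo_value = value
        if running >= hi_rank:
            hi_value = value
            break

    median = (lo_value + hi_value) / 2
    return [float(minimum), float(maximum), weighted_sum / total, median, float(mode)]


print(sample_stats_fused([0, 1, 3, 2, 2]))  # [1.0, 4.0, 2.625, 2.5, 2.0]
```

Using the pair of ranks (total + 1) // 2 and total // 2 + 1 avoids a separate odd/even branch: for an odd total both ranks point at the same element, so averaging them returns that element unchanged.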
Python conveniently comes with a bunch of helper methods :)
from typing import List


class Solution:
    def sampleStats(self, count: List[int]) -> List[float]:
        total_count = sum(count)
        total_sum = sum(x * times for x, times in enumerate(count))

        # smallest value with a non-zero count
        sample_min = len(count)
        for x, cnt in enumerate(count):
            if cnt > 0:
                sample_min = x
                break

        # largest value with a non-zero count
        sample_max = 0
        for x, cnt in reversed(list(enumerate(count))):
            if cnt > 0:
                sample_max = x
                break

        mode = max(range(len(count)), key=lambda x: count[x])

        # odd case
        if total_count % 2 == 1:
            target = total_count // 2 + 1
            current_count = 0
            for x, cnt in enumerate(count):
                current_count += cnt
                if current_count >= target:
                    median = x
                    break
        # even case
        else:
            target_1 = total_count // 2
            target_2 = total_count // 2 + 1
            val_1, val_2 = None, None
            current_count = 0
            for x, cnt in enumerate(count):
                current_count += cnt
                if current_count >= target_1 and val_1 is None:
                    val_1 = x
                if current_count >= target_2 and val_2 is None:
                    val_2 = x
                    break
            median = (val_1 + val_2) / 2

        return [float(sample_min), float(sample_max), total_sum / total_count, float(median), float(mode)]